`README.md` (2 additions, 1 deletion)
@@ -39,6 +39,7 @@ To facilitate use by users unfamiliar with deep learning, we provide a Gradio we
Additionally, we are expanding capabilities for other modalities. Currently, we support full-parameter training and LoRA training for AnimateDiff.
## 🎉 News
- 2024.04.10: Use SWIFT to fine-tune the qwen-7b-chat model to enhance its function-call capabilities, combining it with [Modelscope-Agent](https://github.com/modelscope/modelscope-agent); the best practice can be found [here](https://github.com/modelscope/swift/tree/main/docs/source_en/LLM/Agent-best-practice.md#Usage-with-Modelscope_Agent).
- 🔥2024.04.09: Support ruozhiba dataset. Search `ruozhiba` in [this documentation](docs/source_en/LLM/Supported-models-datasets.md) to begin training!
- 2024.04.08: Support the fine-tuning and inference of XVERSE-MoE-A4.2B model, use [this script](https://github.com/modelscope/swift/blob/main/examples/pytorch/llm/scripts/xverse_moe_a4_2b/lora/sft.sh) to start training!
- 2024.04.04: Support **QLoRA+FSDP** to train a 70B model on two 24GB GPUs, use [this script](https://github.com/modelscope/swift/blob/main/examples/pytorch/llm/scripts/llama2_70b_chat/qlora_fsdp/sft.sh) to train.
@@ -431,7 +432,7 @@ CUDA_VISIBLE_DEVICES=0 swift deploy \
`docs/source_en/LLM/Agent-best-practice.md` (115 additions, 0 deletions)
@@ -10,6 +10,7 @@ SWIFT supports open-source models, especially small and medium-sized models (7B,
- [Data Preparation](#Data-Preparation)
- [Fine-tuning](#Fine-tuning)
- [Inference](#Inference)
- [Usage with Modelscope-Agent](#Usage-with-Modelscope_Agent)
- [Summary](#Summary)
## Environment Setup
@@ -421,7 +422,121 @@ print()
# response:
# Final Answer: There is fire in the image at coordinates [101.1, 200.9]
```
## Usage-with-Modelscope_Agent
In conjunction with [Modelscope-Agent](https://github.com/modelscope/modelscope-agent), you can fine-tune models for building Agents.
This section focuses on the interactive framework AgentFabric within Modelscope-Agent, fine-tuning the small model qwen-7b-chat to enable function-call capabilities.
Due to the mismatch between the system prompt in ms-agent and that in Modelscope-Agent, direct training yields suboptimal results. To address this, we have created a new dataset, [ms_agent_for_agentfabric](https://modelscope.cn/datasets/AI-ModelScope/ms_agent_for_agentfabric/summary), by converting the format from ms-agent; it is now integrated into SWIFT. The `ms-agent-for-agentfabric-default` subset includes 30,000 entries converted from ms-agent data, while `ms-agent-for-agentfabric-additional` contains 488 entries filtered from actual function-call access data of the open-source AgentFabric framework.
### Fine-tuning
Replace `dataset` with `ms-agent-for-agentfabric-default` and `ms-agent-for-agentfabric-additional`:
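A minimal sketch of what the fine-tuning command might look like, assuming the `swift sft` CLI flags used elsewhere in this repository (`--model_type`, `--dataset`, `--sft_type`, `--output_dir`); verify flag names against your installed SWIFT version:

```shell
# Hypothetical sketch: LoRA fine-tuning of qwen-7b-chat on the AgentFabric datasets.
# Flag names follow the swift sft convention used in this repo's example scripts.
CUDA_VISIBLE_DEVICES=0 swift sft \
    --model_type qwen-7b-chat \
    --dataset ms-agent-for-agentfabric-default ms-agent-for-agentfabric-additional \
    --sft_type lora \
    --output_dir output
```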
Note that if deploying with `swift deploy`, the value of `model` should be set to `qwen-7b-chat`.
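A deployment sketch consistent with the `swift deploy` invocation shown earlier; the checkpoint path is a placeholder to be replaced with your actual fine-tuned checkpoint directory:

```shell
# Hypothetical sketch: serve the fine-tuned model behind an OpenAI-compatible API.
# Replace the ckpt_dir placeholder with your real checkpoint path.
CUDA_VISIBLE_DEVICES=0 swift deploy \
    --model_type qwen-7b-chat \
    --ckpt_dir output/qwen-7b-chat/vx-xxx/checkpoint-xxx
```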
#### Launching AgentFabric
In the following practice, [Wanx Image Generation](https://help.aliyun.com/zh/dashscope/opening-service?spm=a2c4g.11186623.0.0.50724937O7n40B) and [Amap Weather]((https://lbs.amap.com/api/webservice/guide/create-project/get-key)) will be called, requiring manual setting of API KEY. After setting, start AgentFabric:
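One way the setup and launch might look, assuming AgentFabric's Gradio entry point lives at `apps/agentfabric/app.py` in the modelscope-agent repository and that the keys are read from the environment variables shown below (both assumptions; verify the exact variable names and path against the upstream AgentFabric README):

```shell
# Hypothetical sketch: export the two API keys, then start the AgentFabric web UI.
export DASHSCOPE_API_KEY=your_dashscope_key   # for Wanx Image Generation
export AMAP_TOKEN=your_amap_key               # for Amap Weather
git clone https://github.com/modelscope/modelscope-agent.git
cd modelscope-agent/apps/agentfabric
pip install -r requirements.txt
python app.py
```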
It can be seen that the fine-tuned model can correctly understand instructions and call tools.
## Summary
Through the Agent training capability supported by SWIFT, we fine-tuned the qwen-7b-chat model using ms-agent and ms-bench. After fine-tuning, the model retains its general knowledge question-answering ability, and when APIs are added to the system field, it can correctly call them and complete tasks. It should be noted that: