yanxi-chen
diff --git a/README.md b/README.md
Lines changed: 1 addition & 1 deletion
diff --git a/README_zh.md b/README_zh.md
Lines changed: 1 addition & 1 deletion
diff --git a/docs/sphinx_doc/assets/grpo_rubric_reward.png b/docs/sphinx_doc/assets/grpo_rubric_reward.png
367 KB
diff --git a/docs/sphinx_doc/source/tutorial/example_search_email.md b/docs/sphinx_doc/source/tutorial/example_search_email.md
Lines changed: 1 addition & 1 deletion
diff --git a/docs/sphinx_doc/source/tutorial/faq.md b/docs/sphinx_doc/source/tutorial/faq.md
Lines changed: 10 additions & 0 deletions
diff --git a/docs/sphinx_doc/source/tutorial/trinity_configs.md b/docs/sphinx_doc/source/tutorial/trinity_configs.md
Lines changed: 7 additions & 1 deletion
diff --git a/docs/sphinx_doc/source_zh/tutorial/example_search_email.md b/docs/sphinx_doc/source_zh/tutorial/example_search_email.md
Lines changed: 1 addition & 1 deletion
diff --git a/docs/sphinx_doc/source_zh/tutorial/faq.md b/docs/sphinx_doc/source_zh/tutorial/faq.md
Lines changed: 12 additions & 0 deletions
diff --git a/docs/sphinx_doc/source_zh/tutorial/trinity_configs.md b/docs/sphinx_doc/source_zh/tutorial/trinity_configs.md
Lines changed: 7 additions & 1 deletion
diff --git a/examples/grpo_email_search/README.md b/examples/grpo_email_search/README.md
Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
-[**中文主页**](https://github.com/modelscope/Trinity-RFT/blob/main/README_zh.md) | [**Tutorial**](https://modelscope.github.io/Trinity-RFT/) | [**FAQ**](./docs/sphinx_doc/source/tutorial/faq.md)
+[**中文主页**](https://github.com/modelscope/Trinity-RFT/blob/main/README_zh.md) | [**Tutorial**](https://modelscope.github.io/Trinity-RFT/) | [**FAQ**](https://modelscope.github.io/Trinity-RFT/en/main/tutorial/faq.html)
 
 <div align="center">
   <img src="https://img.alicdn.com/imgextra/i1/O1CN01lvLpfw25Pl4ohGZnU_!!6000000007519-2-tps-1628-490.png" alt="Trinity-RFT" style="height: 120px;">
 
@@ -1,4 +1,4 @@
-[**English Homepage**](https://github.com/modelscope/Trinity-RFT/blob/main/README.md) | [**中文文档**](https://modelscope.github.io/Trinity-RFT/zh/) | [**常见问题**](./docs/sphinx_doc/source/zh/tutorial/faq.md)
+[**English Homepage**](https://github.com/modelscope/Trinity-RFT/blob/main/README.md) | [**中文文档**](https://modelscope.github.io/Trinity-RFT/zh/) | [**常见问题**](https://modelscope.github.io/Trinity-RFT/zh/main/tutorial/faq.html)
 
 <div align="center">
   <img src="https://img.alicdn.com/imgextra/i1/O1CN01lvLpfw25Pl4ohGZnU_!!6000000007519-2-tps-1628-490.png" alt="Trinity-RFT" style="height: 120px;">
 
@@ -1,7 +1,7 @@
 # Email Search Workflow
 
 
-This example shows a multi-turn email search workflow, inspired by [ART](https://openpipe.ai/blog/art-e-mail-agent?refresh=1756431423904). We implement a ReAct Agent and define tools for email search. Note that this example rewquires installing `AgentScope==0.1.6`.
+This example shows a multi-turn email search workflow, inspired by [ART](https://openpipe.ai/blog/art-e-mail-agent?refresh=1756431423904). We implement a ReAct Agent and define tools for email search. Note that this example requires installing [AgentScope](https://github.com/agentscope-ai/agentscope?tab=readme-ov-file#-installation).
 
 ## Core Components
 
 
@@ -107,6 +107,16 @@ export RAY_DEDUP_LOGS=0
 trinity run --config grpo_gsm8k/gsm8k.yaml 2>&1 | tee debug.log
 ```
 
+### Debugging the Workflow
+
+To debug a new workflow, use Trinity-RFT's debug mode with the following steps:
+
+1. Launch the inference model via `trinity debug --config <config_file_path> --module inference_model`
+
+2. Debug the workflow in another terminal via `trinity debug --config <config_file_path> --module workflow --output_file <output_file_path> --plugin_dir <plugin_dir>`
+
+Please refer to the {ref}`Workflow Development Guide <Workflows>` section for details.
+
 
 ## Part 4: Other Questions
 **Q:** What's the purpose of `buffer.trainer_input.experience_buffer.path`?
 
@@ -160,7 +160,7 @@ model:
 
 - `model_path`: Path to the model being trained.
 - `critic_model_path`: Optional path to a separate critic model. If empty, defaults to `model_path`.
-- `max_model_len`: Maximum number of tokens in a sequence. It is recommended to set this value manually. If not set, it will be inferred from the model configuration.
+- `max_model_len`: Maximum number of tokens in a sequence. It is recommended to set this value manually. If not specified, the system will attempt to set it to `max_prompt_tokens` + `max_response_tokens`. However, this requires both values to be already set; otherwise, an error will be raised.
 - `max_response_tokens`: Maximum number of tokens allowed in generated responses. Only for `chat` and `generate` methods in `InferenceModel`.
 - `max_prompt_tokens`: Maximum number of tokens allowed in prompts. Only for `chat` and `generate` methods in `InferenceModel`.
 - `min_response_tokens`: Minimum number of tokens allowed in generated responses. Only for `chat` and `generate` methods in `InferenceModel`. Default is `1`. It must be less than `max_response_tokens`.
@@ -405,6 +405,7 @@ trainer:
   trainer_type: 'verl'
   save_interval: 100
   total_steps: 1000
+  save_strategy: "unrestricted"
   trainer_config: null
   trainer_config_path: ''
 ```
@@ -413,6 +414,11 @@ trainer:
 - `trainer_type`: Trainer backend implementation. Currently only supports `verl`.
 - `save_interval`: Frequency (in steps) at which to save model checkpoints.
 - `total_steps`: Total number of training steps.
+- `save_strategy`: The parallel strategy used when saving the model. Defaults to `unrestricted`. The available options are as follows:
+  - `single_thread`: Only one thread across the entire system is allowed to save the model; saving tasks from different threads are executed sequentially.
+  - `single_process`: Only one process across the entire system is allowed to perform saving; multiple threads within that process can handle saving tasks in parallel, while saving operations across different processes are executed sequentially.
+  - `single_node`: Only one compute node across the entire system is allowed to perform saving; processes and threads within that node can work in parallel, while saving operations across different nodes are executed sequentially.
+  - `unrestricted`: No restrictions on saving operations; multiple nodes, processes, or threads are allowed to save the model simultaneously.
 - `trainer_config`: The trainer configuration provided inline.
 - `trainer_config_path`: The path to the trainer configuration file. Only one of `trainer_config_path` and `trainer_config` should be specified.
 
 
@@ -1,6 +1,6 @@
 # 邮件搜索例子
 
-这个示例展示了一个多轮邮件搜索工作流，内容参考自 [ART](https://openpipe.ai/blog/art-e-mail-agent?refresh=1756431423904)。我们实现了一个 ReAct Agent，并定义了用于邮件搜索的工具。注意：此示例需要安装 `AgentScope==0.1.6`。
+这个示例展示了一个多轮邮件搜索工作流，内容参考自 [ART](https://openpipe.ai/blog/art-e-mail-agent?refresh=1756431423904)。我们实现了一个 ReAct Agent，并定义了用于邮件搜索的工具。注意：此示例需要安装 [AgentScope](https://github.com/agentscope-ai/agentscope?tab=readme-ov-file#-installation)。
 
 ## 核心组件
 
 
@@ -106,6 +106,18 @@ export RAY_DEDUP_LOGS=0
 trinity run --config grpo_gsm8k/gsm8k.yaml 2>&1 | tee debug.log
 ```
 
+### 调试工作流（Workflow）
+
+
+实现新工作流后，可使用 Trinity-RFT 的调试模式进行调试，步骤如下：
+
+1. 启动推理模型： `trinity debug --config <config_file_path> --module inference_model`
+
+2. 在另一个终端中进行工作流的调试：`trinity debug --config <config_file_path> --module workflow --output_file <output_file_path> --plugin_dir <plugin_dir>`
+
+更多详细信息，请参阅{ref}`工作流开发指南 <Workflows>`章节。
+
+
 ## 第四部分：其他问题
 **Q:** `buffer.trainer_input.experience_buffer.path` 的作用是什么？
 
 
@@ -160,7 +160,7 @@ model:
 
 - `model_path`: 被训练模型的路径。
 - `critic_model_path`: 可选的独立 critic 模型路径。若为空，则默认为 `model_path`。
-- `max_model_len`: 该模型所支持的单个序列最大 token 数。
+- `max_model_len`: 表示模型所支持的单个序列最大 token 数。如未指定，系统会尝试将其设为 `max_prompt_tokens` + `max_response_tokens`。但前提是这两个值都必须已设置，否则将引发错误。
 - `max_prompt_tokens`: 输入 prompt 中允许的最大 token 数。仅对 `InferenceModel` 中的 `chat` 和 `generate` 方法生效。
 - `max_response_tokens`: 模型生成的回复中允许的最大 token 数。仅对 `InferenceModel` 中的 `chat` 和 `generate` 方法生效。
 - `min_response_tokens`: 模型生成的回复中允许的最小 token 数。仅对 `InferenceModel` 中的 `chat` 和 `generate` 方法生效。
@@ -405,6 +405,7 @@ trainer:
   trainer_type: 'verl'
   save_interval: 100
   total_steps: 1000
+  save_strategy: "unrestricted"
   trainer_config: null
   trainer_config_path: ''
 ```
@@ -413,6 +414,11 @@ trainer:
 - `trainer_type`: trainer 后端实现。目前仅支持 `verl`。
 - `save_interval`: 保存模型检查点的频率（步）。
 - `total_steps`: 总训练步数。
+- `save_strategy`: 模型保存时的并行策略。默认值为`unrestricted`。可选值如下：
+  - `single_thread`：整个系统中，仅允许一个线程进行模型保存，不同保存线程之间串行执行。
+  - `single_process`：整个系统中，仅允许一个进程执行保存，该进程内的多个线程可以并行处理保存任务，不同进程之间串行执行。
+  - `single_node`：整个系统中，仅允许一个计算节点执行保存，该节点内的进程和线程可并行工作，不同节点的保存串行执行。
+  - `unrestricted`：不限制保存操作，允许多个节点、进程或线程同时保存模型。
 - `trainer_config`: 内联提供的 trainer 配置。
 - `trainer_config_path`: trainer 配置文件的路径。`trainer_config_path` 和 `trainer_config` 只能指定其一。
 
 
@@ -1,6 +1,6 @@
 # Email Search Workflow
 
-This example shows a multi-turn email search workflow, inspired by [ART](https://openpipe.ai/blog/art-e-mail-agent?refresh=1756431423904). We implement a ReAct Agent and define tools for email search. Note that this example rewquires installing `AgentScope==0.1.6`.
+This example shows a multi-turn email search workflow, inspired by [ART](https://openpipe.ai/blog/art-e-mail-agent?refresh=1756431423904). We implement a ReAct Agent and define tools for email search. Note that this example requires installing [AgentScope](https://github.com/agentscope-ai/agentscope?tab=readme-ov-file#-installation).
 
 For more detailed information, please refer to the [documentation](../../docs/sphinx_doc/source/tutorial/example_search_email.md).
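
The `max_model_len` fallback described in the `trinity_configs.md` change above can be sketched as follows. This is only an illustration of the documented behavior, not Trinity-RFT's actual code: the function name `resolve_max_model_len` and the choice of `ValueError` are assumptions made for this example.

```python
from typing import Optional


def resolve_max_model_len(
    max_model_len: Optional[int],
    max_prompt_tokens: Optional[int],
    max_response_tokens: Optional[int],
) -> int:
    """Hypothetical helper mirroring the documented fallback rule."""
    # An explicitly configured value always wins.
    if max_model_len is not None:
        return max_model_len
    # The fallback needs both token limits; the docs say an error is raised otherwise.
    if max_prompt_tokens is None or max_response_tokens is None:
        raise ValueError(
            "max_model_len is unset, so both max_prompt_tokens and "
            "max_response_tokens must be set"
        )
    return max_prompt_tokens + max_response_tokens
```

For instance, with `max_prompt_tokens: 1024` and `max_response_tokens: 2048` and no explicit `max_model_len`, the sketch resolves the value to their sum, 3072. Setting `max_model_len` manually, as the documentation recommends, avoids depending on the fallback at all.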