Commit b5f6e9c

add curve for async
1 parent cf7cbad commit b5f6e9c

3 files changed: +3 -16 lines changed
docs/sphinx_doc/source/tutorial/example_async_mode.md

Lines changed: 2 additions & 1 deletion
@@ -35,4 +35,5 @@ bash examples/async_gsm8k/run.sh
 ```
 
 In the following, we show the results of asynchronous mode in the following.
-# TODO
+
+![async](../../assets/async-curve.png)

docs/sphinx_doc/source/tutorial/example_reasoning_advanced.md

Lines changed: 1 addition & 15 deletions
@@ -1,4 +1,4 @@
-# Example: off-policy / asynchronous RFT mode
+# Example: off-policy RFT mode
 
 
 Let's continue with the [previous GSM8k example](./example_reasoning_basic.md) and show some advanced features provided by Trinity-RFT, namely, off-policy or asynchronous RFT mode.
@@ -35,17 +35,3 @@ A similar performance boost is shown at step 21, which leads to a converged scor
 
 
 ![opmd](../../assets/opmd-curve.png)
-
-
-
-
-
-## Asynchronous mode
-
-
-Trinity-RFT supports the asynchronous and decoupled mode of RFT, where explorer and trainer act independently and asynchronously.
-To run this mode, the explorer and trainer need to be launched separately, with the `mode` parameter in the config file set to `explore` and `train` respectively.
-
-
-
-*We are still testing this mode more thoroughly. A concrete example is coming soon!*
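
Note on the removed "Asynchronous mode" text: it says the explorer and trainer are launched separately, each with the `mode` parameter set to `explore` or `train`. A minimal sketch of deriving the two per-process configs from one shared base might look like the following. Only the `mode` key and its two values come from the removed text; `BASE_CONFIG`, `model_path`, and `make_role_config` are hypothetical names for illustration and are not Trinity-RFT's actual schema or API.

```python
import copy

# Hypothetical shared settings. Every key except `mode` is a placeholder,
# not a real Trinity-RFT config field.
BASE_CONFIG = {
    "mode": None,
    "model_path": "path/to/model",
}


def make_role_config(base: dict, mode: str) -> dict:
    """Return a deep copy of `base` with `mode` set for one process.

    This sketch accepts only the two values named in the removed text;
    that restriction is a choice of the sketch, not a statement about
    which modes the framework supports.
    """
    if mode not in ("explore", "train"):
        raise ValueError(f"unsupported mode for this sketch: {mode!r}")
    cfg = copy.deepcopy(base)  # keep the shared base untouched
    cfg["mode"] = mode
    return cfg


# One config per separately launched process.
explorer_config = make_role_config(BASE_CONFIG, "explore")
trainer_config = make_role_config(BASE_CONFIG, "train")
```

Deep-copying the base keeps the two processes' settings independent, so a later change to one role's config does not leak into the other.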