Commit 1678575

update
1 parent eecd0ff commit 1678575

File tree

1 file changed: +2 -2 lines changed

_posts/2025-04-18-openrlhf-vllm.md

Lines changed: 2 additions & 2 deletions
```diff
@@ -11,9 +11,9 @@ As the demand for training reasoning large language models (LLMs) grows, Reinfor
 
 ## Design Philosophy
 
-To address these challenges, OpenRLHF is designed as a user-friendly, high-performance framework for Reinforcement Learning from Human Feedback (RLHF), integrating key technologies such as Ray, vLLM, Zero Redundancy Optimizer (ZeRO-3), and Automatic Tensor Parallelism (AutoTP):
+To address these challenges, [OpenRLHF](https://github.com/OpenRLHF/OpenRLHF) is designed as a user-friendly, high-performance framework for Reinforcement Learning from Human Feedback (RLHF), integrating key technologies such as Ray, vLLM, Zero Redundancy Optimizer (ZeRO-3), and Automatic Tensor Parallelism (AutoTP):
 
-**Ray** serves as the backbone for distributed programming within OpenRLHF. Its robust scheduling and orchestration capabilities make it ideal for managing the complex data flows and computations inherent in RLHF training, including the distribution of reward models across multiple nodes.
+**Ray** serves as the backbone for distributed programming within OpenRLHF. Its robust scheduling and orchestration capabilities make it ideal for managing the complex data flows and computations inherent in RLHF training, including the distribution of rule-based reward models across multiple nodes.
 
 **vLLM with Ray Executor and AutoTP** is central to accelerating inference within OpenRLHF. It naturally supports Ray Executors and integrates with Hugging Face Transformers, enabling efficient weight updates through AutoTP. This combination ensures high-throughput, memory-efficient generation of large language models.
 
```
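
The Ray paragraph changed above describes distributing rule-based reward models across nodes. A minimal sketch of that pattern using Ray's public actor API (`@ray.remote`, `ray.get`) follows; the scoring rule, actor count, and class name are hypothetical illustrations, not OpenRLHF's actual implementation:

```python
# Minimal sketch: a rule-based reward model hosted as a Ray actor so that
# scoring can run anywhere in the cluster. The rule itself is hypothetical.
import ray

ray.init()  # connect to an existing Ray cluster, or start a local one

@ray.remote
class RuleBasedRewardModel:
    """Stateless rule-based reward model exposed as a Ray actor."""

    def score(self, prompt: str, response: str) -> float:
        # Hypothetical rule: reward a boxed final answer, penalize rambling.
        reward = 1.0 if "\\boxed{" in response else 0.0
        reward -= 0.001 * max(0, len(response) - 2048)
        return reward

# Create several reward actors; Ray schedules them across available nodes.
actors = [RuleBasedRewardModel.remote() for _ in range(4)]

# Fan out scoring requests asynchronously, then gather the results.
refs = [a.score.remote("What is 1+1?", r"\boxed{2}") for a in actors]
print(ray.get(refs))  # e.g. [1.0, 1.0, 1.0, 1.0]
```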
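
The unchanged vLLM paragraph in the hunk describes the Ray executor, tensor parallelism, and Hugging Face Transformers integration. A minimal sketch of launching such an engine, assuming vLLM's `LLM` entry point with its `tensor_parallel_size` and `distributed_executor_backend` arguments; the model name and GPU count are placeholders, and the AutoTP weight-update path OpenRLHF layers on top is not shown:

```python
# Minimal sketch: high-throughput generation with vLLM running its workers
# as Ray actors and sharding weights with tensor parallelism. The model
# name and GPU count are placeholders.
from vllm import LLM, SamplingParams

llm = LLM(
    model="meta-llama/Llama-3.1-8B-Instruct",  # any HF Transformers model
    tensor_parallel_size=2,                    # shard weights across 2 GPUs
    distributed_executor_backend="ray",        # run workers as Ray actors
)

params = SamplingParams(temperature=0.8, max_tokens=256)
outputs = llm.generate(["Explain RLHF in one paragraph."], params)
print(outputs[0].outputs[0].text)
```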