Is the newly implemented GRPO supposed to be slower than PPO? #2847

yxchng · 2025-01-23T02:03:25Z

It seems to be much slower on my side with G=8, though it uses less memory (which is expected).

qgallouedec (Maintainer) · 2025-01-23T12:45:34Z

Yes, generation is the main bottleneck here: for each prompt, you need to generate several completions (8 by default). We're currently pushing hard to make generation much faster, see #2600.
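To make the scaling concrete, here is a rough back-of-the-envelope sketch of how the amount of sampled text grows with the group size G. The helper `generated_tokens_per_step` and the batch size and completion length below are illustrative assumptions, not part of the library; the count is an upper bound on sampled tokens, not a wall-clock prediction, since completions are generated in batches and may stop early.

```python
def generated_tokens_per_step(num_prompts: int, num_generations: int,
                              max_completion_length: int) -> int:
    """Upper bound on tokens sampled autoregressively in one training step.

    Hypothetical helper for illustration only: every prompt gets
    `num_generations` completions of at most `max_completion_length` tokens.
    """
    return num_prompts * num_generations * max_completion_length


# Illustrative values (assumptions): 4 prompts per step, 256-token completions.
single = generated_tokens_per_step(num_prompts=4, num_generations=1,
                                   max_completion_length=256)
grouped = generated_tokens_per_step(num_prompts=4, num_generations=8,
                                    max_completion_length=256)

print(f"one completion per prompt: {single} tokens")
print(f"G=8 completions per prompt: {grouped} tokens")
print(f"ratio: {grouped // single}x")
```

With G=8, up to eight times as many tokens are sampled per prompt as in a setup that samples a single completion per prompt, which is consistent with generation dominating the step time.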
