[bug-metrics] The "critic/advantages/mean" metric scalar is extremely small #4747

@Caleb66666

Description

System Info

Image

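Version notes:
20251230 is the run tracking repo-main as of that date
latest is the run tracking repo-main-20251224
The run with the long alphanumeric name is an older version that works correctly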

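repo-main-20251224 has two problems: pg_loss and adv/mean are both close to 0.
repo-main-20251230 fixes one of them: pg_loss is no longer over-scaled.

However, adv/mean still appears to be scaled down too much. Is this similar to #4711?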
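
For anyone comparing the curves, here is a minimal sketch of the kind of sanity check that separates a genuinely small advantage mean from one that was shrunk during metric aggregation. It is not taken from verl and is not a diagnosis of the cause. The helper names `masked_mean` and `overscaled_mean`, the toy tensors, and the divisor `8` are all hypothetical and chosen only for illustration.

```python
def masked_mean(values, mask):
    """Mean of values over the positions where mask is 1 (valid response tokens)."""
    total = sum(v * m for row_v, row_m in zip(values, mask)
                for v, m in zip(row_v, row_m))
    count = sum(m for row in mask for m in row)
    return total / count if count else 0.0


def overscaled_mean(values, mask, extra_divisor):
    """Hypothetical faulty variant: the masked mean is divided one extra time.

    This only illustrates what an over-scaled metric would look like; it is
    an assumption, not the actual code path in any release.
    """
    return masked_mean(values, mask) / extra_divisor


# Toy batch: 2 responses padded to 4 tokens each (mask 0 marks padding).
advantages = [[0.5, 0.5, 0.5, 0.0],
              [-0.2, -0.2, 0.0, 0.0]]
response_mask = [[1, 1, 1, 0],
                 [1, 1, 0, 0]]

reference = masked_mean(advantages, response_mask)
shrunk = overscaled_mean(advantages, response_mask, extra_divisor=8)
print("reference mean:", reference)
print("over-scaled mean:", shrunk)
```

If the logged value differs from a directly computed masked mean by a roughly constant factor across steps, that would point toward a scaling issue in aggregation rather than a change in the advantages themselves.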

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

Steps to reproduce:

  1. Use verl/examples/ppo_trainer/run_qwen2.5-3b_rm_reward_loop_colocate.sh as a reference

Expected behavior

With the latest version, the results show adv/mean close to 0.


Labels: bug
