Commit ee9a5b2

fix(algo): fix bugs for preprocessing reasoning adv (RLinf#335)
* feat: fix bugs

Signed-off-by: Florielle <1205402283@qq.com>
1 parent 4d9d1f4 commit ee9a5b2

1 file changed: +1 -3 lines changed

rlinf/algorithms/utils.py

Lines changed: 1 addition & 3 deletions
@@ -197,9 +197,7 @@ def preprocess_reasoning_advantages_inputs(
         kwargs.update({"rewards": expanded_rewards})
 
     elif kwargs["adv_type"] == "grpo":
-        grouped_rewards = (
-            rewards.reshape(-1, kwargs["group_size"]).transpose(0, 1).contiguous()
-        )
+        grouped_rewards = rewards.reshape(-1, kwargs["group_size"]).contiguous()
         kwargs.update(
             {
                 "rewards": grouped_rewards,
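
The effect of dropping `.transpose(0, 1)` can be illustrated with a small sketch. It uses plain Python lists as a stand-in for the tensor so that it runs without any dependencies; the helper names `group_rows` and `transpose`, the reward values, and the assumption that the responses for each prompt are stored contiguously in a flat `rewards` sequence are all hypothetical and not taken from the commit. The commit itself does not say how downstream code consumes `grouped_rewards`, so this only shows how the layout changes.

```python
def group_rows(rewards, group_size):
    """Row-major split into rows of group_size, mimicking reshape(-1, group_size)."""
    if len(rewards) % group_size != 0:
        raise ValueError("len(rewards) must be divisible by group_size")
    return [rewards[i:i + group_size] for i in range(0, len(rewards), group_size)]


def transpose(rows):
    """Swap the two axes of a rectangular list of lists, mimicking transpose(0, 1)."""
    return [list(col) for col in zip(*rows)]


# Hypothetical data: 3 prompts with 2 responses each, stored prompt by prompt.
rewards = [0.0, 1.0, 10.0, 11.0, 20.0, 21.0]
group_size = 2

# Layout after this commit: one row per group, shape (num_groups, group_size).
new_layout = group_rows(rewards, group_size)

# Layout before this commit: the extra transpose gives (group_size, num_groups).
old_layout = transpose(new_layout)

print(new_layout)  # [[0.0, 1.0], [10.0, 11.0], [20.0, 21.0]]
print(old_layout)  # [[0.0, 10.0, 20.0], [1.0, 11.0, 21.0]]
```

Under these assumptions, each row of the new layout holds the rewards of a single group, whereas each row of the old layout mixed one response from every group.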
