Commit 6186f9f

Bump LanguageReward match_reward to 2.0
Increased match_reward from default 1.0 to 2.0 to give more weight to language matching in the multi-objective reward.
1 parent f72be7f commit 6186f9f

1 file changed: +5 -1 lines changed

sandbox/grpo_language/main.py

Lines changed: 5 additions & 1 deletion
@@ -329,7 +329,11 @@ async def main(cfg: DictConfig):
                 MathReward(),
                 ThinkingReward(tag="思考"),  # Use Japanese tag
                 LanguageReward(
-                    target_language="ja", tag="思考", debug=True, debug_sample_rate=0.1
+                    target_language="ja",
+                    tag="思考",
+                    match_reward=2.0,
+                    debug=True,
+                    debug_sample_rate=0.1,
                 ),  # Japanese language reward with debug
             ]
         ),
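
As a rough illustration of what the commit message means by giving language matching "more weight", here is a minimal sketch. It assumes the individual rewards are combined by plain summation, which this diff does not show, and it uses hypothetical stand-in functions (`toy_language_reward`, `total_reward`) rather than the repository's actual `LanguageReward` class.

```python
# Hypothetical stand-ins for illustration only; these are NOT the
# repository's reward classes, and the summation below is an assumption.

def toy_language_reward(is_target_language: bool, match_reward: float = 1.0) -> float:
    """Return match_reward when the response is in the target language, else 0.0."""
    return match_reward if is_target_language else 0.0


def total_reward(math_score: float, thinking_score: float, language_score: float) -> float:
    """Assumed aggregation: a plain sum of the individual reward terms."""
    return math_score + thinking_score + language_score


# A response written in the target language that earns nothing from the
# other two terms: its total is exactly the language term.
before = total_reward(0.0, 0.0, toy_language_reward(True, match_reward=1.0))
after = total_reward(0.0, 0.0, toy_language_reward(True, match_reward=2.0))

# A response in the wrong language gets no language reward either way.
wrong_language = toy_language_reward(False, match_reward=2.0)
```

Under that assumption, doubling `match_reward` doubles the contribution of a language match to the total, while responses in the wrong language are unaffected.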
