Skip to content

Conversation

DNXie
Copy link
Member

@DNXie DNXie commented Aug 22, 2025

  • Add Reward abstract class
  • Add MathReward class
  • Add unit tests for MathReward
  • Add ThinkingReward class
  • Add unit tests for ThinkingReward

Test:

python tests/unit_tests/data/test_math_reward.py

..........................
----------------------------------------------------------------------
Ran 26 tests in 0.002s

OK
python tests/unit_tests/rl/test_thinking_reward.py
................
----------------------------------------------------------------------
Ran 16 tests in 0.000s

OK

@DNXie DNXie requested a review from pbontrager August 22, 2025 00:58
@meta-cla meta-cla bot added the CLA Signed This label is managed by the Meta Open Source bot. label Aug 22, 2025
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pbontrager If multiple numbers appear, take the last one. Does this design sound reasonable to you?

@DNXie DNXie requested a review from joecummings August 22, 2025 01:20
@DNXie DNXie changed the title Add math reward Add math/thinking reward Aug 22, 2025
Copy link
Contributor

@pbontrager pbontrager left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think for now we should just have a rewards.py file instead of a whole folder so it's easier to scan through. Otherwise looks good!

@DNXie DNXie merged commit db60329 into meta-pytorch:main Aug 25, 2025
4 checks passed
@DNXie
Copy link
Member Author

DNXie commented Aug 25, 2025

I think for now we should just have a rewards.py file instead of a whole folder so it's easier to scan through. Otherwise looks good!

Solved in #68

@DNXie DNXie deleted the add_math_reward branch September 10, 2025 19:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CLA Signed This label is managed by the Meta Open Source bot.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants