Skip to content

Conversation

@garyzhang99
Copy link
Collaborator

Description

Implemented and tested the SAPO algorithm. The algorithm details can refer to the SAPO paper.

We confirm that SAPO achieves notable gains compared to pure GRPO:

image

Checklist

Please check the following items before code is ready to be reviewed.

  • Code has passed all tests
  • Docstrings have been added/updated in Google Style
  • Documentation has been updated
  • Code is ready for review

@garyzhang99
Copy link
Collaborator Author

/unittest-module-algorithm

@github-actions
Copy link

github-actions bot commented Dec 2, 2025

Summary

Tests 📝 Passed ✅ Failed ❌ Skipped ⏭️ Other ❓ Flaky 🍂 Duration ⏱️
15 15 0 0 0 0 18.0s

Tests

Test Name Status Flaky Duration
tests/algorithm/advantage_fn_test.py::TestGroupedAdvantageFn::test_batch_level_std_grpo 41ms
tests/algorithm/advantage_fn_test.py::TestGroupedAdvantageFn::test_batch_level_step_wise_grpo_advantage 2ms
tests/algorithm/advantage_fn_test.py::TestGroupedAdvantageFn::test_duplicate_grpo 5ms
tests/algorithm/advantage_fn_test.py::TestGroupedAdvantageFn::test_grpo_advantage 3ms
tests/algorithm/advantage_fn_test.py::TestGroupedAdvantageFn::test_grpo_correct_bias 2ms
tests/algorithm/advantage_fn_test.py::TestGroupedAdvantageFn::test_grpo_reward_std 1ms
tests/algorithm/advantage_fn_test.py::TestGroupedAdvantageFn::test_step_wise_grpo_advantage 2ms
tests/algorithm/advantage_fn_test.py::TestGroupedAdvantageFn::test_step_wise_grpo_with_std_threshold 2ms
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_dpo_policy_loss 2ms
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_gspo_policy_loss 1ms
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_mix_policy_loss 1ms
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_opmd_policy_loss 1ms
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_ppo_policy_loss 1ms
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_sapo_policy_loss 1ms
tests/algorithm/policy_loss_test.py::VerlPolicyLossTest::test_sft_policy_loss 1ms

Github Test Reporter by CTRF 💚

@pan-x-c pan-x-c merged commit b2fd301 into modelscope:main Dec 3, 2025
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants