Collaborator

@XiaoBoAI commented Jan 7, 2026

  • Refactor reward_fn.py in pairwise/pointwise: convert comments to English, unify code style (double quotes, formatting), remove unused imports
  • Refactor chat_rl_dataset.py: improve code quality and formatting
  • Add report_generator.py for zero-shot evaluation pipeline

Collaborator

@ployts left a comment

LGTM

@ployts merged commit f21476f into main Jan 7, 2026
2 checks passed
@XiaoBoAI deleted the refactor/grpo-reward-and-report-generator branch January 7, 2026 10:17