Refactor Algorithm-Related Code for Better Maintainability and Extensibility

To improve testability and maintainability, and to make it easier to add new algorithms in the future, we propose a comprehensive refactoring of the algorithm-related code. The plan is as follows:

1. **Create an `algorithm` module** to centralize the implementations of the algorithm components. This includes:
   - `advantage_fn`
   - `policy_loss_fn`
   - `kl_loss_fn`
   - `entropy_loss_fn`
   - `read_strategy`

2. **Extend `AlgorithmConfig`** so that the `Trainer` can select and configure the appropriate algorithm implementations from the configuration alone.

3. **Remove SFT/DPO/RFT-specific logic** from the current `Trainer`, and replace it with a unified `train_step` abstraction.

4. **Add documentation** to the Developer Guide explaining how to implement and integrate new algorithms.
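
As a rough illustration of items 1 and 2, the sketch below shows one way a registry inside an `algorithm` module could let `AlgorithmConfig` select implementations by name. Everything beyond the names `AlgorithmConfig`, `advantage_fn`, and `policy_loss_fn` is an assumption made for illustration: the `register`, `build`, and `resolve` helpers, the string-valued config fields, and the two toy implementations are hypothetical and are not the project's actual API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical registry: component kind -> implementation name -> callable.
_REGISTRY: Dict[str, Dict[str, Callable]] = {}


def register(kind: str, name: str) -> Callable:
    """Decorator that records an implementation under (kind, name)."""
    def decorator(fn: Callable) -> Callable:
        _REGISTRY.setdefault(kind, {})[name] = fn
        return fn
    return decorator


def build(kind: str, name: str) -> Callable:
    """Look up a registered implementation, failing loudly on unknown names."""
    try:
        return _REGISTRY[kind][name]
    except KeyError:
        raise ValueError(f"unknown {kind}: {name!r}") from None


@register("advantage_fn", "mean_baseline")
def mean_baseline_advantage(rewards: List[float]) -> List[float]:
    # Toy advantage: reward minus the batch mean.
    baseline = sum(rewards) / len(rewards)
    return [r - baseline for r in rewards]


@register("policy_loss_fn", "reinforce")
def reinforce_loss(logprobs: List[float], advantages: List[float]) -> float:
    # Toy REINFORCE-style loss: negative mean of logprob * advantage.
    total = sum(lp * adv for lp, adv in zip(logprobs, advantages))
    return -total / len(logprobs)


@dataclass
class AlgorithmConfig:
    # Assumed fields: each one names a registered implementation.
    advantage_fn: str = "mean_baseline"
    policy_loss_fn: str = "reinforce"


def resolve(config: AlgorithmConfig) -> Dict[str, Callable]:
    """Turn the names in the config into callables a Trainer could use."""
    return {
        "advantage_fn": build("advantage_fn", config.advantage_fn),
        "policy_loss_fn": build("policy_loss_fn", config.policy_loss_fn),
    }


fns = resolve(AlgorithmConfig())
advantages = fns["advantage_fn"]([1.0, 2.0, 3.0])
loss = fns["policy_loss_fn"]([-1.0, -2.0, -3.0], advantages)
```

With a layout like this, adding a new algorithm would mean registering a function and naming it in the config, without touching the `Trainer`.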

This refactoring will improve code clarity, reduce coupling, and streamline the process of adding and testing new algorithms.
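
The unified `train_step` abstraction from item 3 could take a shape like the following minimal sketch. It assumes each algorithm is a class exposing `train_step(batch)` and that the `Trainer` only calls that method. The class names `Algorithm`, `SFTAlgorithm`, and `PolicyGradientAlgorithm`, the `fit` method, the batch layout, and the toy losses are all hypothetical and chosen only to keep the example self-contained.

```python
from abc import ABC, abstractmethod
from typing import Dict, List

Batch = Dict[str, List[float]]


class Algorithm(ABC):
    """Hypothetical base class: one training step per batch."""

    @abstractmethod
    def train_step(self, batch: Batch) -> Dict[str, float]:
        ...


class SFTAlgorithm(Algorithm):
    def train_step(self, batch: Batch) -> Dict[str, float]:
        # Toy supervised loss: mean negative log-likelihood.
        logprobs = batch["logprobs"]
        return {"loss": -sum(logprobs) / len(logprobs)}


class PolicyGradientAlgorithm(Algorithm):
    def train_step(self, batch: Batch) -> Dict[str, float]:
        # Toy policy-gradient loss with a mean-reward baseline.
        logprobs, rewards = batch["logprobs"], batch["rewards"]
        baseline = sum(rewards) / len(rewards)
        total = sum(lp * (r - baseline) for lp, r in zip(logprobs, rewards))
        return {"loss": -total / len(logprobs)}


class Trainer:
    """Contains no algorithm-specific branches; it delegates to train_step."""

    def __init__(self, algorithm: Algorithm) -> None:
        self.algorithm = algorithm

    def fit(self, batches: List[Batch]) -> List[Dict[str, float]]:
        return [self.algorithm.train_step(batch) for batch in batches]


sft_metrics = Trainer(SFTAlgorithm()).fit([{"logprobs": [-1.0, -3.0]}])
pg_metrics = Trainer(PolicyGradientAlgorithm()).fit(
    [{"logprobs": [-1.0, -3.0], "rewards": [0.0, 2.0]}]
)
```

Because the `Trainer` depends only on the `train_step` interface, each algorithm can be unit-tested in isolation with small hand-built batches.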

Refactor Algorithm-Related Code for Better Maintainability and Extensibility #59
