Add example: Paper Reproduction, Replacing the Energy-Based Reward Model (EBRM) in the Paper with QBM #95
[Paper reproduction: replacing the Energy-Based Reward Model (EBRM) in the paper with QBM]
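Task description: Reward models (RMs) are essential for aligning large language models (LLMs) with human preferences, yet they often struggle to capture complex human preferences and to generalize to unseen data. This task replaces the Energy Score module of the Energy-Based Reward Model (EBRM) from the paper "Energy-Based Reward Models for Robust Language Model Alignment" with a QBM, and compares the results on the datasets used in the paper.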
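In the paper's EBRM, the energy score algorithm takes two inputs, the feature vector 'embedding' produced by the RM and the RM's score 'r', and returns 'r*' as the refined score. We replace the energy score with a QBM and keep the same inputs and output, so the QBM likewise produces the refined score.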
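The (embedding, r) -> r* interface described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the names `refine_reward` and `toy_energy`, the quadratic stand-in energy, and the choice of plain gradient descent on r are hypothetical and are not taken from this PR's code or from the paper's implementation. The point is only that any scorer with this signature, whether an energy score or a QBM-based one, can be swapped in behind the same call.

```python
from typing import Callable, Sequence

# Hypothetical type: any scorer that maps (embedding, r) to a scalar energy.
# An EBRM energy score or a QBM-based scorer could both sit behind this signature.
EnergyFn = Callable[[Sequence[float], float], float]


def refine_reward(
    energy: EnergyFn,
    embedding: Sequence[float],
    r: float,
    steps: int = 50,
    lr: float = 0.1,
    eps: float = 1e-4,
) -> float:
    """Return a refined score r* by descending energy(embedding, r) in r only.

    Assumption: refinement is iterative gradient descent starting from the
    RM's score r; the embedding is held fixed throughout.
    """
    r_star = float(r)
    for _ in range(steps):
        # Central finite difference approximates dE/dr without autograd.
        grad = (energy(embedding, r_star + eps) - energy(embedding, r_star - eps)) / (2 * eps)
        r_star -= lr * grad
    return r_star


def toy_energy(embedding: Sequence[float], r: float) -> float:
    """Stand-in energy: quadratic in r with its minimum at the embedding mean.

    This is NOT the paper's energy function or a QBM; it only makes the
    sketch runnable.
    """
    target = sum(embedding) / len(embedding)
    return (r - target) ** 2


# The RM gave r = 1.0; refinement pulls it toward the toy energy's minimum.
r_star = refine_reward(toy_energy, [0.2, 0.4, 0.6], r=1.0)
print(f"r = 1.0 -> r* = {r_star:.4f}")
```

Because the refinement step only depends on the `EnergyFn` signature, replacing `toy_energy` with a different scorer requires no change to the caller, which mirrors the "same parameter passing" approach described in this PR.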
For training, we trained on both datasets for 5 epochs each and visualized the results (see the example/qbm_ebrm_results/imgs/ folder or example/qbm_ebrm_results/README.md).
File changes:
close #78
References: