Add `Fine-Tuning a Vision Language Model with TRL using MPO` recipe by sergiopaniego · Pull Request #318 · huggingface/cookbook

sergiopaniego · 2025-07-21T16:41:39Z

What does this PR do?

Add Fine-Tuning a Vision Language Model with TRL using MPO recipe

Fixes #317

Who can review?

@merveenoyan and @stevhliu

review-notebook-app · 2025-07-21T16:41:43Z

Check out this pull request on ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB

HuggingFaceDocBuilderDev · 2025-07-21T16:45:59Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

review-notebook-app · 2025-07-21T18:47:56Z

stevhliu commented on 2025-07-21T18:47:56Z
----------------------------------------------------------------

I think this `loss_type` list is more appropriate in section 3.3, and you probably don't need to list all possible types. Only list the ones you'll use and refer the reader to the docs for the rest.

review-notebook-app · 2025-07-21T18:47:57Z

stevhliu commented on 2025-07-21T18:47:57Z
----------------------------------------------------------------

I think maybe this needs to be in English 😁

stevhliu

Really cool, thanks for the new recipe!

sergiopaniego · 2025-07-22T13:13:09Z

Ready for review, as the MPO PR in trl is already merged!

Thanks for the comments @stevhliu. I've addressed them.

Add Fine-Tuning a Vision Language Model with TRL using MPO recipe

6c2b2e7

stevhliu approved these changes Jul 21, 2025

Updated with trained model and main

67ed18a

sergiopaniego marked this pull request as ready for review July 22, 2025 13:11

merveenoyan merged commit e0fcb99 into huggingface:main Jul 23, 2025
1 check passed

sergiopaniego deleted the mpo-recipe branch July 23, 2025 15:24

merveenoyan merged 2 commits into huggingface:main from sergiopaniego:mpo-recipe

4 participants
