
2025 Wrap-up: Fine-tuning Gemma with Kauldron Example ✦︎ #1414

@lanctot


Hello everyone!

We've been hard at work this year on OpenSpiel 2.0, which will be better than ever. Major developments are underway to make working with language models easier. I'm looking forward to what 2026 will bring in terms of agentic models!

In the meantime, we've developed a simple example showing how you can fine-tune a Gemma model using the Kauldron training library. The example generates data from a Monte Carlo tree search (MCTS) bot in self-play to build a dataset of (state, action) pairs, which is then used to fine-tune a Gemma model.
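To make the data-generation step concrete, here is a minimal, self-contained sketch of the idea: a search-based bot plays both sides of a toy game (single-pile Nim stands in for the OpenSpiel game here), and each (state, chosen action) pair is recorded for later fine-tuning. The game, the flat Monte Carlo search, and the state-string format below are all illustrative assumptions; the actual notebook uses OpenSpiel's games and its MCTS bot.

```python
import math
import random

# Toy stand-in for an OpenSpiel game: single-pile Nim.
# Players alternate taking 1-3 stones; whoever takes the last stone wins.
TAKES = (1, 2, 3)

def legal_actions(stones):
    return [a for a in TAKES if a <= stones]

def rollout_return(stones, player_to_move, perspective):
    """Play random moves to the end; +1 if `perspective` wins, else -1."""
    while stones > 0:
        take = random.choice(legal_actions(stones))
        stones -= take
        if stones == 0:
            # The player who just moved took the last stone and wins.
            return 1.0 if player_to_move == perspective else -1.0
        player_to_move = 1 - player_to_move
    return 0.0

def choose_action(stones, player, n_sims=400, c=1.4):
    """Flat Monte Carlo search with UCB action selection (a simplified MCTS)."""
    visits = {a: 0 for a in legal_actions(stones)}
    returns = {a: 0.0 for a in legal_actions(stones)}
    for i in range(1, n_sims + 1):
        # Pick the action with the highest UCB score (unvisited actions first).
        a = max(visits, key=lambda x: float("inf") if visits[x] == 0
                else returns[x] / visits[x] + c * math.sqrt(math.log(i) / visits[x]))
        rest = stones - a
        value = 1.0 if rest == 0 else rollout_return(rest, 1 - player, player)
        visits[a] += 1
        returns[a] += value
    return max(visits, key=lambda x: visits[x])

def self_play_episode(start_stones=10):
    """Collect (state-string, action) pairs from one self-play game."""
    data, stones, player = [], start_stones, 0
    while stones > 0:
        action = choose_action(stones, player)
        data.append((f"stones={stones} to_move=p{player}", action))
        stones -= action
        player = 1 - player
    return data

# Concatenate several episodes into a fine-tuning dataset.
dataset = [pair for _ in range(3) for pair in self_play_episode()]
print(len(dataset))
```

In the real example, the state string would come from the OpenSpiel state's text representation and the action from the MCTS bot, but the overall loop has this shape: search, record the pair, apply the action, repeat until the game ends.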

You can find the example in colabs/open_spiel_gemma.ipynb. Note that it requires a TPU runtime.

It builds directly on the Gemma and Kauldron APIs, which ties it specifically to Gemma. We would, of course, welcome a contribution (or holiday hobby project!) from anyone who wants to generalize the example to the Hugging Face API so that various models can be swapped in and out. That would make it even more accessible to students and practitioners.
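For anyone tempted by that project, the model-agnostic piece is mostly data plumbing: turning (state, action) pairs into the prompt/completion text records that a causal-LM fine-tuning stack (such as Hugging Face `datasets`/`transformers`) typically consumes. The prompt template below is a hypothetical illustration, not the notebook's actual format.

```python
# Hypothetical sketch: map (state_string, action) pairs to prompt/completion
# records. The template is an assumption for illustration; any causal-LM
# fine-tuning pipeline would substitute its own formatting here.

def to_records(pairs):
    """Convert (state_string, action) pairs into text records."""
    records = []
    for state, action in pairs:
        records.append({
            "prompt": f"Game state:\n{state}\nBest action:",
            "completion": f" {action}",
        })
    return records

# Illustrative pairs, e.g. from a self-play data-generation step.
pairs = [("x.o|.x.|..o, x to move", 4), ("x.o|.xx|..o, o to move", 3)]
records = to_records(pairs)
print(records[0]["prompt"])
```

Once the data is in this shape, the remaining work is picking a tokenizer and trainer for the chosen model, which is exactly the part the Hugging Face API would abstract over.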

Special thanks to @dhennes for designing the original version that this example is based on and @Conchylicultor for all the support with Gemma and Kauldron.

Enjoy, Happy Holidays, and Happy New Year from the OpenSpiel dev team! πŸ™ πŸŽ‰ ⋆꙳‒❅*β€§ β˜ƒοΈβ€§*❆ β‚Šβ‹† βœ¨βœ§Λ–Β°. πŸ’–πŸ€—
