
2025 Wrap-up: Fine-tuning Gemma with Kauldron Example ✦︎ #1414

@lanctot


Hello everyone!

We've been hard at work this year on OpenSpiel 2.0, which will be better than ever. Major developments are underway to make working with language models easier. I'm looking forward to what 2026 will bring in terms of agentic models!

In the meantime, we've developed a simple example showing how you can fine-tune a Gemma model using the Kauldron training library. The example generates data from a Monte Carlo tree search (MCTS) bot in self-play to build a dataset of (state, action) pairs, which is then used to fine-tune a Gemma model.
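To make the data-generation step concrete, here is a minimal, self-contained sketch of the idea: a search-based bot plays both sides of a toy game (single-pile Nim stands in for the OpenSpiel game here), and each (state, chosen action) pair is recorded for later fine-tuning. The game, the flat Monte Carlo search, and the state-string format below are all illustrative assumptions; the actual notebook uses OpenSpiel's games and its MCTS bot.

```python
import math
import random

# Toy stand-in for an OpenSpiel game: single-pile Nim.
# Players alternate taking 1-3 stones; whoever takes the last stone wins.
TAKES = (1, 2, 3)

def legal_actions(stones):
    return [a for a in TAKES if a <= stones]

def rollout_return(stones, player_to_move, perspective):
    """Play random moves to the end; +1 if `perspective` wins, else -1."""
    while stones > 0:
        take = random.choice(legal_actions(stones))
        stones -= take
        if stones == 0:
            # The player who just moved took the last stone and wins.
            return 1.0 if player_to_move == perspective else -1.0
        player_to_move = 1 - player_to_move
    return 0.0

def choose_action(stones, player, n_sims=400, c=1.4):
    """Flat Monte Carlo search with UCB action selection (a simplified MCTS)."""
    visits = {a: 0 for a in legal_actions(stones)}
    returns = {a: 0.0 for a in legal_actions(stones)}
    for i in range(1, n_sims + 1):
        # Pick the action with the highest UCB score (unvisited actions first).
        a = max(visits, key=lambda x: float("inf") if visits[x] == 0
                else returns[x] / visits[x] + c * math.sqrt(math.log(i) / visits[x]))
        rest = stones - a
        value = 1.0 if rest == 0 else rollout_return(rest, 1 - player, player)
        visits[a] += 1
        returns[a] += value
    return max(visits, key=lambda x: visits[x])

def self_play_episode(start_stones=10):
    """Collect (state-string, action) pairs from one self-play game."""
    data, stones, player = [], start_stones, 0
    while stones > 0:
        action = choose_action(stones, player)
        data.append((f"stones={stones} to_move=p{player}", action))
        stones -= action
        player = 1 - player
    return data

# Concatenate several episodes into a fine-tuning dataset.
dataset = [pair for _ in range(3) for pair in self_play_episode()]
print(len(dataset))
```

In the real example, the state string would come from the OpenSpiel state's text representation and the action from the MCTS bot, but the overall loop has this shape: search, record the pair, apply the action, repeat until the game ends.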

You can find the example in colabs/open_spiel_gemma.ipynb. Note that it requires a TPU runtime.

It builds directly on the Gemma and Kauldron APIs, which ties it specifically to Gemma. We would, of course, welcome a contribution (or holiday hobby project!) from anyone who wants to generalize the example to the Hugging Face API so that various models can be swapped in and out. That would make it even more accessible to students and practitioners.
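For anyone tempted by that project, the model-agnostic piece is mostly data plumbing: turning (state, action) pairs into the prompt/completion text records that a causal-LM fine-tuning stack (such as Hugging Face `datasets`/`transformers`) typically consumes. The prompt template below is a hypothetical illustration, not the notebook's actual format.

```python
# Hypothetical sketch: map (state_string, action) pairs to prompt/completion
# records. The template is an assumption for illustration; any causal-LM
# fine-tuning pipeline would substitute its own formatting here.

def to_records(pairs):
    """Convert (state_string, action) pairs into text records."""
    records = []
    for state, action in pairs:
        records.append({
            "prompt": f"Game state:\n{state}\nBest action:",
            "completion": f" {action}",
        })
    return records

# Illustrative pairs, e.g. from a self-play data-generation step.
pairs = [("x.o|.x.|..o, x to move", 4), ("x.o|.xx|..o, o to move", 3)]
records = to_records(pairs)
print(records[0]["prompt"])
```

Once the data is in this shape, the remaining work is picking a tokenizer and trainer for the chosen model, which is exactly the part the Hugging Face API would abstract over.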

Special thanks to @dhennes for designing the original version that this example is based on and @Conchylicultor for all the support with Gemma and Kauldron.

Enjoy, Happy Holidays, and Happy New Year from the OpenSpiel dev team! πŸ™ πŸŽ‰ ⋆꙳‒❅*β€§ β˜ƒοΈβ€§*❆ β‚Šβ‹† βœ¨βœ§Λ–Β°. πŸ’–πŸ€—
