Update orbax guide by amitsrivastava78 · Pull Request #2308 · keras-team/keras-io

amitsrivastava78 · 2026-03-04T06:20:19Z

Replaces the existing Orbax checkpointing guide (which required users to define and copy-paste custom KerasOrbaxCheckpointManager and OrbaxCheckpointCallback wrapper classes) with a comprehensive guide for the built-in keras.callbacks.OrbaxCheckpoint callback.

What changed

Removed ~100 lines of manual wrapper boilerplate — the built-in callback handles everything.
All examples use the public keras.callbacks.OrbaxCheckpoint API directly.

Sections covered

Basic Usage — drop-in callback with model.fit()
Loading a model — keras.saving.load_model()
Loading weights only — model.load_weights()
Resuming training — step recovery from optimizer iterations
Save best only — monitor, mode, save_best_only
Batch-level checkpointing — save_freq=N
Distributed training — Keras Distribution API with ModelParallel
Cross-layout resharding — load under a different LayoutMap
Callback parameters reference — full parameter table
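The "Resuming training" item describes recovering the step from the optimizer's iteration counter. The arithmetic behind that idea can be sketched as below, assuming a fixed number of batches per epoch. `resume_position` and `steps_per_epoch` are hypothetical names used only for illustration; they are not part of the Keras API, and the guide's actual code may differ.

```python
def resume_position(iterations: int, steps_per_epoch: int) -> tuple[int, int]:
    """Split a global optimizer step count into (completed_epochs, batches_into_epoch).

    Hypothetical helper: assumes every epoch has exactly `steps_per_epoch` batches.
    """
    if steps_per_epoch <= 0:
        raise ValueError("steps_per_epoch must be positive")
    if iterations < 0:
        raise ValueError("iterations must be non-negative")
    return divmod(iterations, steps_per_epoch)


# Example: a restored optimizer reports 250 completed steps, 100 batches per epoch.
initial_epoch, batches_done = resume_position(250, 100)
print(initial_epoch, batches_done)  # 2 50
```

Under this assumption, the first value is the number a caller could pass as `initial_epoch` to `model.fit()` so that training continues from the restored step rather than restarting at epoch 0.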

gemini-code-assist · 2026-03-04T06:20:35Z

Summary of Changes

This pull request overhauls the Orbax checkpointing guide for Keras, transitioning from a custom callback approach to utilizing the newly integrated keras.callbacks.OrbaxCheckpoint. The update provides a more streamlined and feature-rich experience for users, offering extensive examples that demonstrate its capabilities across various scenarios, including complex distributed training setups and flexible cross-layout resharding. The aim is to provide clear, practical guidance for saving and restoring model states efficiently within the JAX backend.

Highlights

Updated Orbax Checkpointing Guide: The guide for Orbax checkpointing in Keras has been significantly updated to reflect the usage of the built-in keras.callbacks.OrbaxCheckpoint.
Removal of Custom Implementations: The previously required custom KerasOrbaxCheckpointManager and OrbaxCheckpointCallback classes have been removed, simplifying the integration of Orbax checkpointing.
Expanded Examples and Features: New sections and examples have been added covering basic usage, loading models and weights, resuming training, save_best_only functionality, batch-level checkpointing, distributed training with model parallelism, and cross-layout resharding.
Distributed Training and Resharding: The guide now includes detailed demonstrations of how OrbaxCheckpoint works seamlessly with the Keras Distribution API, supporting multi-host environments and automatic resharding of checkpoints when loading under different layout maps.
Callback Parameters Reference: A comprehensive table outlining all available parameters for the keras.callbacks.OrbaxCheckpoint has been included for easy reference.
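The `save_best_only` behavior mentioned above, driven by `monitor` and `mode`, can be pictured as a small "is this an improvement?" check. The sketch below is a plain-Python illustration of that decision logic only. `BestOnlyTracker` is a hypothetical class, not the callback's real implementation, and it assumes `mode` is either "min" or "max".

```python
import math


class BestOnlyTracker:
    """Tracks the best monitored value seen so far (hypothetical helper)."""

    def __init__(self, mode: str = "min"):
        if mode not in ("min", "max"):
            raise ValueError("mode must be 'min' or 'max'")
        self.mode = mode
        # Start from the worst possible value so the first result always counts.
        self.best = math.inf if mode == "min" else -math.inf

    def should_save(self, value: float) -> bool:
        """Return True, and record the value, only when it improves on the best."""
        if self.mode == "min":
            improved = value < self.best
        else:
            improved = value > self.best
        if improved:
            self.best = value
        return improved


# Example: monitoring a validation loss, where lower is better.
tracker = BestOnlyTracker(mode="min")
val_losses = [0.9, 0.7, 0.8, 0.6]
saved_epochs = [
    epoch for epoch, loss in enumerate(val_losses) if tracker.should_save(loss)
]
print(saved_epochs)  # [0, 1, 3]
```

Epoch 2 is skipped because 0.8 does not improve on the best value of 0.7 seen so far.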

Changelog

guides/orbax_checkpoint.py
- Replaced custom KerasOrbaxCheckpointManager and OrbaxCheckpointCallback with the built-in keras.callbacks.OrbaxCheckpoint.
- Expanded the guide with new sections on basic usage, loading models, loading weights, resuming training, save_best_only, and batch-level checkpointing.
- Added comprehensive examples for distributed training with model parallelism, including DeviceMesh and LayoutMap configurations.
- Demonstrated cross-layout resharding capabilities for loading checkpoints under different sharding layouts.
- Included a reference table for OrbaxCheckpoint callback parameters.
- Updated guide metadata, including author, creation, and last modified dates, and a more descriptive summary.
- Modified the example model architecture and training data size for clarity and robustness.
- Added JAX-specific environment variable settings for simulating multi-device environments.
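The last changelog item refers to environment variables for simulating several devices on one machine, and a commit in this PR notes that `XLA_FLAGS` must be set before framework imports. A minimal sketch of that ordering follows, assuming the commonly used `--xla_force_host_platform_device_count` flag. `force_host_device_count` is a hypothetical helper, and the guide's own code is not shown on this page (a later commit mentions switching to `jax.config`).

```python
import os

_FLAG_NAME = "--xla_force_host_platform_device_count"


def force_host_device_count(count: int) -> str:
    """Set the host device count in XLA_FLAGS, preserving unrelated flags.

    Hypothetical helper for illustration; call it before importing JAX.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    existing = os.environ.get("XLA_FLAGS", "").split()
    # Drop any earlier setting of the same flag so it is not duplicated.
    kept = [flag for flag in existing if not flag.startswith(_FLAG_NAME + "=")]
    os.environ["XLA_FLAGS"] = " ".join(kept + [f"{_FLAG_NAME}={count}"])
    return os.environ["XLA_FLAGS"]


# This must run before JAX (or anything that imports JAX) is imported,
# because the device topology is fixed once the backend initializes.
force_host_device_count(8)
# import jax  # only after the flag is set
```

Keeping the flag in a helper that is called at the very top of the script makes the ordering requirement explicit and avoids clobbering flags a user has already exported.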

gemini-code-assist

Code Review

This pull request provides a comprehensive update to the Orbax checkpointing guide. The previous version, which relied on custom-built callback classes, has been replaced with a new guide that leverages the official, built-in keras.callbacks.OrbaxCheckpoint. The updated guide is much more thorough, covering basic usage, resuming training, distributed training with the Keras Distribution API, and cross-layout resharding. The changes significantly improve the quality and usefulness of the guide. I have one minor suggestion to reduce code duplication for improved clarity.

guides/orbax_checkpoint.py

amitsrivastava78 · 2026-03-04T07:03:37Z

@hertschuh The guide is ready for review. PTAL

hertschuh

Approved with a couple nitpicks:

guides/orbax_checkpoint.py

amitsrivastava78 · 2026-03-04T19:15:15Z

> Approved with a couple nitpicks:

Thanks for approving. I have updated the guide as per the comments.

amitsrivastava78 added 3 commits March 4, 2026 11:09

Update Orbax Checkpointing guide for built-in OrbaxCheckpoint callback

10de53d

Fix: move XLA_FLAGS before framework imports to avoid locked topology

d05d20d

Fix typo in XLA_FLAGS comment

55b672a

amitsrivastava78 requested review from MarkDaoust and fchollet as code owners March 4, 2026 06:20

github-actions bot assigned sachinprasadhs Mar 4, 2026

gemini-code-assist bot reviewed Mar 4, 2026

amitsrivastava78 added 2 commits March 4, 2026 11:57

Remove duplicate get_distributed_model; reuse get_model

8cfc2b9

Fix guide for execution: add cleanup, initial_epoch, keyword args, layout shapes; regenerate .ipynb and .md

d66ae8f

hertschuh approved these changes Mar 4, 2026

Use jax.config for CPU devices, say Keras 3.14, verify both loaded models

026b3b6

hertschuh approved these changes Mar 4, 2026

hertschuh merged commit 73e0ac8 into keras-team:master Mar 4, 2026
3 checks passed

hertschuh merged 6 commits into keras-team:master from amitsrivastava78:update-orbax-guide
