Best practices for checkpointing under grad? #16804
Unanswered
paulbricman asked this question in Q&A
Replies: 1 comment
- Wrap your
Thanks for the awesome module.
I'm working on a project involving meta-learning, where I have an inner optimization loop and an outer one. In the outer one, I'm essentially evaluating
jax.value_and_grad(inner_loop)
and updating the outer-loop parameters based on the resulting gradients. However, I'd like to checkpoint the parameters being optimized as part of the inner loop. If I try this, I run into the following error:
Unfortunately, while these parameters are located outside the grad computation of the inner loop, they are located inside the grad computation of the outer one. In the docs, I learn that "Like pure_callback, io_callback fails under automatic differentiation if it is passed a differentiated variable."
In this context, what is the best way to do an io_callback from inside the grad of the outer loop while still passing a differentiated variable as an argument (for checkpointing)?
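A minimal sketch of the kind of setup I mean (names like inner_loop, save_ckpt, and the toy quadratic loss are just placeholders, not my real code): the inner parameters depend on the outer ones, so by the time io_callback sees them inside jax.value_and_grad they are differentiated variables, which is what triggers the failure.

```python
import jax
import jax.numpy as jnp
from jax.experimental import io_callback


def save_ckpt(inner_params):
    # Host-side side effect; a real version would write to disk.
    print("checkpoint:", inner_params)


def inner_loop(outer_params):
    # Toy inner optimization: a few SGD steps on a quadratic loss,
    # with the inner parameters initialised from the outer ones.
    inner_params = outer_params
    for _ in range(3):
        grads = jax.grad(lambda p: jnp.sum(p ** 2))(inner_params)
        inner_params = inner_params - 0.1 * grads
        # inner_params is differentiated w.r.t. outer_params here, so this
        # io_callback fails once the outer value_and_grad is taken.
        io_callback(save_ckpt, None, inner_params)
    return jnp.sum(inner_params ** 2)


outer_loss, outer_grads = jax.value_and_grad(inner_loop)(jnp.ones(4))
```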
Potential options:
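One option I can think of (sketched below as a guess at a general workaround, not something confirmed in this thread): since checkpointing only needs the parameter values and never their gradients, hand the callback a detached copy via jax.lax.stop_gradient, so io_callback never receives a differentiated variable.

```python
import jax
from jax import lax
from jax.experimental import io_callback


def save_ckpt(params):
    # Placeholder writer; a real version would serialise to disk.
    print("checkpoint:", params)


def checkpoint(params):
    # stop_gradient cuts the AD connection without changing the values, so
    # io_callback no longer sees a differentiated variable.
    io_callback(save_ckpt, None, lax.stop_gradient(params))
```

Swapping the direct io_callback call in the sketch above for checkpoint(inner_params) should then let the outer jax.value_and_grad trace through, because no gradients ever need to flow into the checkpoint.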