Given a jitted function:

```python
import jax

@jax.jit
def f(x):
    return x**2

df = jax.grad(f)

df(1.)  # Hopefully this populates some cache.
print(f._cache_size())
# ==> 0  # Was expecting 1

f(1.)  # Of course this populates the cache.
print(f._cache_size())
# ==> 1  # As expected, directly calling f populates the cache
```
Thanks for the question! It's a bit complicated, both for fundamental reasons (caching transformed versions of functions, given the way JAX's tracing works) and for transient, historical-path-dependent reasons.

There are two global jit caches: one for C++ dispatch and one for dispatch handled by Python. The Python cache is fully general, in the sense that it works with all manner of transformations applied to the jitted function, while the C++ cache only works with buffers-in, buffers-out dispatching, and hence not with transformation-of-jit cases. (The C++ cache is populated by stealing entries from the Python cache when it can.) The `_cache_size()` API (which isn't public AFAIK, is it eve…)

To see the cache grow in grad-of-jit calls, you need to dig pretty carefully:

```python
import jax

@jax.jit
def f(x):
    return x**2

df = jax.grad(f)

def hilariously_tricky_cache_size():
    # Reach into JAX internals: the Python-level dispatch cache lives in a
    # closure cell of the memoized _xla_callable.
    from jax._src import dispatch
    return sum(len(entries)
               for entries in dispatch._xla_callable.__closure__[1].cell_contents.values())

print(hilariously_tricky_cache_size())  # 0
df(1.)  # Hopefully this populates some cache.
print(hilariously_tricky_cache_size())  # 2
```

The reason there are two entries is that there's a cached jitted forward pass and a cached jitted backward pass. If you'd like a public API for that, open a feature request and let's discuss it! WDYT?