How to do in-place parameter update? #12625
-
A common training pattern is to repeatedly apply an update function to a parameter array. However, each update copies the parameter, which doubles peak memory usage. Can this update be done in place?

```python
import jax
import jax.numpy as jnp

@jax.jit
def update(x):
    return x + 1

def main():
    # x takes 2GB memory
    x = jnp.zeros((2 * 1024 * 1024 * 1024 // 4), dtype=jnp.float32)
    # A training loop
    for _ in range(10):
        x = update(x)
```
Answered by tomhennigan, Oct 3, 2022
-
This is supported via "buffer donation"; change your update function to the following:

```python
def update(x):
    return x + 1

update = jax.jit(update, donate_argnums=0)
```
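For context, here is a minimal runnable sketch of the donated-buffer loop (array sizes shrunk for illustration; note that donation is honored on GPU/TPU backends, while CPU backends may ignore it and emit a warning, in which case memory is not actually reused):

```python
import jax
import jax.numpy as jnp

def update(x):
    return x + 1

# donate_argnums=0 tells XLA it may reuse the buffer backing the
# first argument for the output, avoiding a second 2GB allocation.
update = jax.jit(update, donate_argnums=0)

x = jnp.zeros(4, dtype=jnp.float32)
for _ in range(3):
    x = update(x)  # old buffer may be reused for the new x

print(x)  # x is now [3., 3., 3., 3.]
```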
Answer selected by
imoneoi
-
Thanks! I enabled buffer donation, but now my program hangs forever with 0% GPU utilization. How can I track down the problem?
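One thing worth ruling out (a sketch of a common pitfall, not a diagnosis of this specific program): once an argument is donated, the original array must not be touched again after the jitted call.

```python
import jax
import jax.numpy as jnp

update = jax.jit(lambda x: x + 1, donate_argnums=0)

x = jnp.zeros(4, dtype=jnp.float32)
y = update(x)
# On GPU/TPU, x's buffer has now been donated to produce y; reading x
# here raises an error about a deleted/donated buffer. On CPU, donation
# may be ignored (with a warning), so the access can still succeed.
# Only y should be used from this point on.
```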