Fori_loop array indexing issue #12432
-
Hi, I am trying to implement mini-batch gradient descent in JAX; however, I run into an indexing error inside the loop body function. The way I compute the batch indices is by relying on the loop counter:

```python
import jax
import jax.numpy as jnp
import optax
import numpy as np


def lin_model(params, x):
    return jnp.dot(params, x)


def loss_mse(params, x, y, model):
    yh = model(params, x)
    mse = jnp.mean((yh - y)**2, axis=1)
    return jnp.sqrt(jnp.sum(mse**2))


def fit_adam_sgd(params, x, y, model, loss_fn, epochs, lr, batch_size):
    def update(params, x, y, opt_state):
        loss, grads = jax.value_and_grad(loss_fn)(params, x, y, model)
        updates, opt_state = optimizer.update(grads, opt_state, params)
        params = optax.apply_updates(params, updates)
        return loss, updates, grads, opt_state

    def batch_step(i, carry):
        batches = carry[2]
        # slicing with the loop counter i is where the error occurs
        loss, updates, grads, opt_state = update(carry[0],
                                                 x[..., i * batches:(i + 1) * batches],
                                                 y[..., i * batches:(i + 1) * batches],
                                                 carry[1])
        carry[0] = carry[0] + updates
        carry[1] = opt_state
        return carry

    def epoch_step(i, carry):
        carry = jax.lax.fori_loop(0, carry[2], batch_step, carry)
        return carry

    optimizer = optax.adam(learning_rate=lr)
    opt_state = optimizer.init(params)
    batches = int(x.shape[-1] / batch_size)
    carry = [params, opt_state, batches]
    res = jax.lax.fori_loop(0, epochs, epoch_step, carry)
    return res[0]


np.random.seed(40)
n = 100
x = np.random.rand(3, n)
x[2, :] = 1.0
y = (2 + x[0, :] + x[1, :]).reshape(1, -1)
params = np.random.rand(y.shape[0], x.shape[0])

jit_fit_adam_sgd = jax.jit(fit_adam_sgd, static_argnums=(3, 4),
                           static_argnames=('epochs', 'batch_size'))
w = jit_fit_adam_sgd(params, x, y, lin_model, loss_mse,
                     epochs=10000, lr=0.04, batch_size=10)
print(loss_mse(w, x, y, lin_model))
```

In words, I have two loops for mini-batch GD: the outer loop runs over the number of epochs (considered static), the inner over the batches (here kept in a fixed order, although in general it is easy to shuffle the observations). Let us ignore that the current implementation leaves some observations out of training when `x.shape[-1] mod batch_size != 0`. I am trying to learn a simple linear regression.

Final note: if I replace the epoch step with the version below (thereby implementing batch GD), it works, which I take as a proof of correctness.

```python
def epoch_step2(i, carry):
    loss, updates, grads, opt_state = update(carry[0], x, y, carry[1])
    carry[0] = carry[0] + updates
    carry[1] = opt_state
    return carry
```

How can I get around the indexing error above? Thanks!
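For reference, here is a stripped-down sketch that appears to hit the same failure in isolation (the toy array and shapes are made up for illustration and are unrelated to the regression above):

```python
import jax
import jax.numpy as jnp

data = jnp.arange(12.0).reshape(3, 4)

def body(i, acc):
    # NumPy-style slicing needs static start/stop values, but i is traced inside fori_loop
    return acc + jnp.sum(data[:, i * 2:(i + 1) * 2])

# This raises at trace time because the slice bounds depend on the traced counter i
total = jax.lax.fori_loop(0, 2, body, jnp.float32(0.0))
```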
-
The counter `i` is in fact traced: the way that `fori_loop` works is to trace/compile the body function once in order to determine its behavior for abstract values of `i`, and then run that compiled code in sequence for every value of `i`. I suspect you could do what you want by replacing the numpy-style slicing with a call to `lax.dynamic_slice`. Here's an example of how the two APIs compare:

```python
import jax.numpy as jnp
from jax import lax

x = jnp.arange(240).reshape(2, 3, 40)
batch_size = 4
i = 1

out_static = x[..., i * batch_size: (i + 1) * batch_size]  # all indices must be static
print(out_static)
# [[[  4   5   6   7]
#   [ 44  45  46  47]
#   [ 84  85  86  87]]
#  [[124 125 126 127]
#   [164 165 166 167]
#   [204 205 206 207]]]

out_dynamic = lax.dynamic_slice(
    x,
    start_indices=(0, 0, i * batch_size),     # Note: start_indices may be dynamic
    slice_sizes=(*x.shape[:-1], batch_size))  # slice sizes must be static
print(out_dynamic)
# [[[  4   5   6   7]
#   [ 44  45  46  47]
#   [ 84  85  86  87]]
#  [[124 125 126 127]
#   [164 165 166 167]
#   [204 205 206 207]]]
```
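Following up on that suggestion, here is a self-contained sketch (toy data, no optimizer; all names and shapes here are made up) of the same per-batch slicing done inside `fori_loop` with `lax.dynamic_slice`. The same pattern should drop into the `batch_step` from the question, provided the slice width comes from the static Python int `batch_size` in `fit_adam_sgd`'s scope rather than from the traced value stored in the carry:

```python
import jax
import jax.numpy as jnp
from jax import lax

# Toy stand-ins for the x, y in the question: shapes (features, n_obs) and (1, n_obs)
x = jnp.arange(30.0).reshape(3, 10)
y = jnp.arange(10.0).reshape(1, 10)
batch_size = 5                          # static Python int, required for slice_sizes
n_batches = x.shape[-1] // batch_size   # 2

def batch_step(i, total):
    # The start index may depend on the traced counter i,
    # but slice_sizes must be static.
    xb = lax.dynamic_slice(x, start_indices=(0, i * batch_size),
                           slice_sizes=(x.shape[0], batch_size))
    yb = lax.dynamic_slice(y, start_indices=(0, i * batch_size),
                           slice_sizes=(y.shape[0], batch_size))
    return total + jnp.sum(xb) + jnp.sum(yb)  # placeholder for the optimizer update

total = jax.lax.fori_loop(0, n_batches, batch_step, jnp.float32(0.0))
print(total)  # sums over all of x and y: 435.0 + 45.0 = 480.0
```

One thing to keep in mind: unlike NumPy slicing, `dynamic_slice` clamps the start index so the slice stays in bounds, so a final partial batch would silently shift backwards rather than shrink.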