-
Thanks for the question. 10x sounds suspicious! The easiest way to tell what's going on is to grab a profile. If you provide a fully runnable repro, i.e. your full timing script, I might be able to help with that. (What backend are you running on? GPU?)

But one shot-in-the-dark guess is that you're timing compile time. That is, if you're timing this:

```python
loss, grad = jax.jit(jax.value_and_grad(batch_jax_loss))(p_params, *batch)
```

that will include compile time. You might want to separate out the compile time, especially if you plan to evaluate this function more than once. Maybe something like:

```python
import time
import jax

# batch_jax_loss, jax_rollout2, p_params, env, and key come from your script.
jax_rollout2 = jax.jit(jax_rollout2)  # jit this function too if you haven't already
gradfun = jax.jit(jax.value_and_grad(batch_jax_loss))

# First call: traces, compiles, and executes.
tic = time.time()
batch = jax_rollout2(p_params, env, key)
loss, grad = gradfun(p_params, *batch)
loss.block_until_ready()  # wait for the async computation before stopping the clock
print('compile and first execution: ', time.time() - tic)

# Second call: reuses the cached compilation, so this measures execution only.
tic = time.time()
batch = jax_rollout2(p_params, env, key)
loss, grad = gradfun(p_params, *batch)
loss.block_until_ready()
print('second execution time: ', time.time() - tic)
```
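To make the compile-versus-execute split concrete, here's a minimal self-contained sketch (the function, names, and shapes are made up for illustration; they're not from the repro above). The first call to a jitted function traces and compiles it for the given input shapes; later calls with the same shapes reuse the cached executable:

```python
import time
import jax
import jax.numpy as jnp

@jax.jit
def toy_loss(w, x):
    # A small dense computation standing in for batch_jax_loss.
    return jnp.mean((x @ w) ** 2)

w = jnp.ones((512, 512))
x = jnp.ones((64, 512))

tic = time.time()
toy_loss(w, x).block_until_ready()  # trace + compile + execute
print('first call (includes compile):', time.time() - tic)

tic = time.time()
toy_loss(w, x).block_until_ready()  # cached compilation: execute only
print('second call (execution only):', time.time() - tic)
```

If the 10x gap disappears on the second call, compilation was the culprit.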
-
Consider the two PyTorch and JAX functions below, which are the same apart from the call that computes which action to take (and those calls, in isolation, take the same amount of time to compute), i.e.

```python
a, log_prob = torch_policy(obs)
```

vs.

```python
a, log_prob = jax_policy(p_params, obs, key)
```

However, when trying to backprop through them, JAX is 10x slower than PyTorch.

[the two full function definitions, PyTorch version vs. JAX version, did not survive extraction]

I even tried separating the rollout from the loss computation (which costs an extra forward pass over the entire batch); that was faster, but still ~4x slower than PyTorch's. I wouldn't expect such a big performance difference. Is there an optimization I'm missing, or something? Thanks
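For context on the signature difference in the question: JAX functions are pure, so parameters and the PRNG key are passed in explicitly, whereas a PyTorch module holds its parameters and uses global RNG state. A hypothetical minimal stand-in for jax_policy (illustrative only; not the actual policy from this benchmark) might look like:

```python
import jax
import jax.numpy as jnp

def jax_policy(p_params, obs, key):
    # Hypothetical linear Gaussian policy with explicit params and PRNG key.
    mean = obs @ p_params['w'] + p_params['b']
    a = mean + jax.random.normal(key, mean.shape)        # sample an action
    log_prob = -0.5 * jnp.sum((a - mean) ** 2, axis=-1)  # unit-variance Gaussian, up to a constant
    return a, log_prob

# Example call mirroring the signature in the question:
key = jax.random.PRNGKey(0)
p_params = {'w': jnp.zeros((4, 2)), 'b': jnp.zeros(2)}
a, log_prob = jax_policy(p_params, jnp.ones((8, 4)), key)
```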