Question about custom_jvp when using sparse.BCOO #8936

DoTulip · 2021-12-14T16:53:45Z

DoTulip
Dec 14, 2021

Hello JAX team,
I am trying to use BCOO for some calculations, and I need custom_jvp to customize the gradient of an operation. However, the gradient I get with BCOO differs from the one I get with a dense matrix. Why is this? How can I define a custom gradient with BCOO? (This is my first time using the BCOO module, so I may not be using it correctly.)

Here's the code to reproduce:

import jax.numpy as jnp
from jax import value_and_grad, custom_jvp
import jax.experimental.sparse as jes

@custom_jvp
def funD(matrix_A, matrix_B):
    out = matrix_A @ matrix_B
    out = out.sum()
    return out


@funD.defjvp
def funD_jvp(primals, tangents):
    matrix_A, matrix_B = primals
    mA_dot, mB_dot = tangents
    v = jnp.array([1, 2, 3, 4]).T
    primal_out = funD(matrix_A, matrix_B)
    tangent_out = -v.T @ mA_dot * v * 2 + v.T @ mB_dot * v
    return primal_out, tangent_out.sum()

if __name__ == '__main__':
    matrix_B_use = jnp.ones([4, 4])
    data = jnp.array([1., 2., 3., 4.])
    iK = jnp.array([0, 1, 2, 3])
    jK = jnp.array([0, 1, 2, 3])
    idx = (iK, jK)
    indices = jnp.array([[0, 0], [1, 1], [2, 2], [3, 3]]) 

    def fun_sparse(x):
        data_use = x * data
        mat = jes.BCOO((data_use, indices), shape=(4, 4))
        obj = funD(mat, matrix_B_use)
        return obj

    def fun_dense(x):
        data_use = x * data
        mat = jnp.zeros([4, 4])
        mat = mat.at[idx].add(data_use)
        obj = funD(mat, matrix_B_use)
        return obj

    x = jnp.array([1., 2., 3., 4.])
    obj_sp, gradout_sp = value_and_grad(fun_sparse)(x)
    print("obj_sp:", obj_sp)  # print: obj_sp: 120.0
    print("dc_sp:", gradout_sp) # print: dc_sp: [-2. -4. -6. -8.]
    obj_dense, gradout_dense = value_and_grad(fun_dense)(x)
    print("obj_dense:", obj_dense)  # print: obj_dense: 120.0
    print("dc_dense:", gradout_dense) # print: dc_dense: [  -2.  -16.  -54. -128.]

Answered by jakevdp

jakevdp · 2021-12-14T17:26:19Z

jakevdp
Dec 14, 2021
Maintainer

I believe that the difference here is that the dense JVP is perturbing every element of the array, while the sparse JVP is perturbing only defined elements in the array. In your example, mA_dot has 16 elements for the dense version, and 4 elements for the sparse version. When you eventually sum those, the results will be different.

This is by design: otherwise, taking the grad of a sparse matrix would require instantiating a dense matrix of the same size, which is problematic in many applications.
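The difference in the number of perturbed elements can be seen without any custom rule. The following is a minimal sketch, assuming the same 4x4 diagonal pattern as in the question; the names `loss_sparse` and `loss_dense` are made up for illustration. It differentiates the same computation once with respect to the stored BCOO data and once with respect to a dense matrix:

```python
import jax
import jax.numpy as jnp
from jax.experimental import sparse

# Same diagonal sparsity pattern as in the question.
indices = jnp.array([[0, 0], [1, 1], [2, 2], [3, 3]])
data = jnp.array([1., 2., 3., 4.])
matrix_B = jnp.ones((4, 4))

def loss_sparse(d):
    # Only the 4 stored values are inputs to differentiation.
    mat = sparse.BCOO((d, indices), shape=(4, 4))
    return (mat @ matrix_B).sum()

def loss_dense(m):
    # All 16 entries of the matrix are inputs to differentiation.
    return (m @ matrix_B).sum()

g_sparse = jax.grad(loss_sparse)(data)
g_dense = jax.grad(loss_dense)(jnp.diag(data))

print(g_sparse.shape)  # one gradient entry per stored element
print(g_dense.shape)   # one gradient entry per matrix element
```

Here the sparse gradient matches the diagonal of the dense gradient; the dense version additionally carries entries for the twelve positions that the sparse matrix never stores.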

5 replies

DoTulip Dec 15, 2021
Author

Thank you for the kind answer! Sparse matrices are a commonly used tool in scientific computing, and I think many people will use them. Could you add some content to the documentation that tells users how to define custom_jvp correctly when using BCOO?

DoTulip Dec 15, 2021
Author

By the way, will JAX support more scipy.sparse operations in the future? The use of a sparse matrix will significantly improve the efficiency of the code.

jakevdp Dec 15, 2021
Maintainer

I don't know how to correctly define a custom JVP for sparse matrices. I don't know of anyone besides you who has ever tried this... perhaps, given your experience, you may be the best person to write those docs.

DoTulip Dec 17, 2021
Author

I am very sorry to have bothered you so many times about this issue. Unfortunately, I have not managed to get the behavior I wanted. I have read some related studies, and they say that if the input of a function is sparse, the sparsity should also be preserved when defining the automatic differentiation rules (I am not sure whether my statement is accurate). Regardless of whether JVP or VJP is used, JAX does not support defining the output of a function as a sparse form.

In my research, I need to compute the eigenvalues of very large matrices. Storing them as dense matrices puts a heavy burden on memory, so I would like to use a sparse representation instead. In Scipy, I can use the scipy.sparse.linalg.eigsh module to achieve my needs. Will JAX implement similar functions in the future?

JAX has greatly facilitated my research, thank you for your hard work! Hope that JAX can achieve more powerful functions in the future!

jakevdp Dec 17, 2021
Maintainer

> Regardless of whether JVP or VJP is used, JAX does not support defining the output of a function as a sparse form.

I don't think this statement is true. It's certainly the case that I have not thought deeply about how to do it (nor has anyone else, to my knowledge), but I'm confident that with some thought, JAX's custom JVP/VJP approaches can be made to work with sparse inputs and/or outputs.
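One possible way to approach this, not confirmed anywhere in this thread, is to attach the custom rule to a function of the stored data array and rebuild the BCOO matrix inside it, so that primals and tangents are ordinary arrays with one entry per stored element. The sketch below assumes a fixed sparsity pattern and uses the true derivative of sum(A @ B) as a stand-in rule (not the rule from the question); `sum_matmul` is a hypothetical name:

```python
import jax
import jax.numpy as jnp
from jax.experimental import sparse

# Fixed sparsity pattern (a 4x4 diagonal), closed over by the functions below.
indices = jnp.array([[0, 0], [1, 1], [2, 2], [3, 3]])
shape = (4, 4)

@jax.custom_jvp
def sum_matmul(data, dense_b):
    # `data` holds only the stored entries of the sparse matrix.
    mat = sparse.BCOO((data, indices), shape=shape)
    return (mat @ dense_b).sum()

@sum_matmul.defjvp
def sum_matmul_jvp(primals, tangents):
    data, dense_b = primals
    data_dot, b_dot = tangents  # data_dot has one entry per stored element
    mat = sparse.BCOO((data, indices), shape=shape)
    mat_dot = sparse.BCOO((data_dot, indices), shape=shape)
    primal_out = sum_matmul(data, dense_b)
    # d[sum(A @ B)] = sum(dA @ B) + sum(A @ dB)
    tangent_out = (mat_dot @ dense_b).sum() + (mat @ b_dot).sum()
    return primal_out, tangent_out

data = jnp.array([1., 2., 3., 4.])
dense_b = jnp.ones((4, 4))
value, grad_data = jax.value_and_grad(sum_matmul)(data, dense_b)
print(value)
print(grad_data)
```

Because the tangent is rebuilt into a BCOO matrix with the same indices, the rule can be written with the same matrix expressions as the dense case while never materializing a dense 4x4 tangent.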

> In Scipy, I can use the scipy.sparse.linalg.eigsh module to achieve my needs. Will JAX implement similar functions in the future?

Yes, it would certainly be possible to implement sparse eigenvalue solvers in JAX, but they're not yet part of the package. See #3112 and #4336 for some explorations in this area, and see https://jax.readthedocs.io/en/latest/_autosummary/jax.scipy.sparse.linalg.gmres.html for related iterative solvers that are already part of JAX.
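As a rough illustration of what a matrix-free eigenvalue computation on a BCOO matrix can look like, here is a minimal power-iteration sketch. It is a deliberately simple stand-in and not an equivalent of scipy.sparse.linalg.eigsh: it only estimates the eigenvalue of largest magnitude, and the helper name `power_iteration` and the iteration count are assumptions made for this example.

```python
import jax
import jax.numpy as jnp
from jax.experimental import sparse

# Small diagonal test matrix; its largest eigenvalue is 4.
indices = jnp.array([[0, 0], [1, 1], [2, 2], [3, 3]])
data = jnp.array([1., 2., 3., 4.])
mat = sparse.BCOO((data, indices), shape=(4, 4))

def power_iteration(matvec, v0, num_iters=200):
    """Estimate the dominant eigenpair using only matrix-vector products."""
    def body(_, v):
        w = matvec(v)
        return w / jnp.linalg.norm(w)

    v = jax.lax.fori_loop(0, num_iters, body, v0 / jnp.linalg.norm(v0))
    # Rayleigh quotient of the normalized vector.
    return v @ matvec(v), v

eigval, eigvec = power_iteration(lambda v: mat @ v, jnp.ones(4))
print(eigval)
```

The matrix is only ever touched through `mat @ v`, so the dense form is never built.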

SNMS95 · 2023-07-20T07:41:09Z

SNMS95
Jul 20, 2023

@DoTulip,

Did you figure out how to write a custom_jvp for sparse operations?
I am also working with very large matrices, so I would like to use the sparse form to save memory.
Specifically, I want to create explicit matrix-vector products that can be fed to any linear solver.

# Pseudo-code
1. def func(original_numpy_array_data):
       return BCOO_sparse_array
2. Wrap func with pure_callback
3. Wrap func with custom_vjp
4. Call func to create the matrix
5. Create a mat-vec function
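Steps 4 and 5 of this plan can be sketched as follows, leaving out the pure_callback and custom_vjp wrapping. This is a minimal sketch under stated assumptions: the tridiagonal test matrix and the name `matvec` are made up for illustration, and the matrix is built from a dense array only for brevity (a real application would construct the BCOO directly from its data and indices).

```python
import jax.numpy as jnp
from jax.experimental import sparse
from jax.scipy.sparse.linalg import cg

# Symmetric positive definite tridiagonal test matrix (2 on the diagonal, -1 off it).
n = 5
dense = 2.0 * jnp.eye(n) - jnp.eye(n, k=1) - jnp.eye(n, k=-1)
mat = sparse.BCOO.fromdense(dense)  # dense construction is for brevity only

def matvec(v):
    # Matrix-vector product that uses only the stored entries.
    return mat @ v

b = jnp.ones(n)
x, _ = cg(matvec, b)  # cg accepts a function in place of a matrix
residual = jnp.linalg.norm(matvec(x) - b)
print(x)
print(residual)
```

Conjugate gradient is used here because the test matrix is symmetric positive definite; for a general matrix, a solver such as jax.scipy.sparse.linalg.gmres takes a mat-vec function in the same way.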

2 replies

jakevdp Jul 20, 2023
Maintainer

A new answer on an already answered years-old question is not a great place to get visibility for a new question. You might consider opening a new discussion topic.

SNMS95 Jul 20, 2023

Hi Jake, I understand. I had no other way of contacting the OP to see whether he had found an alternative.
Thanks for all your help btw!
