Getting nan in grad of polynomial root finder #18745

melsophos · 2023-11-30T11:02:57Z


I am trying to implement the Aberth method to find the roots of a polynomial in JAX. I want a function that is differentiable and that can be jitted on GPU (to bypass #11322). The second part works, but the first does not.

Below is the code I have written. It can be improved algorithmically (especially the initialization step), but for now I want a minimal working function. The function takes the coefficients of the polynomial and computes all roots at the same time. Starting from some initial guess (here random), it iteratively calls compute_offset, which moves the guess towards the solution. I have commented out the loops (using for and lax.fori_loop) to clarify the origin of the problem.

To test the gradient, I have written two functions: one parametrizes the coefficients of a polynomial in terms of some parameter a, the other computes a scalar from the roots. If there is a single call to compute_offset in the root-finding algorithm, grad returns a number. If there are two calls, as below, grad returns nan.

I have used config.update("jax_debug_nans", True) to find the source of the nan: it tells me that the division 1 / matrix produces nan (since matrix contains zeros on the diagonal) already in the first call to compute_offset. What confuses me is that grad still returns a number in that case. I tried replacing the nan_to_num with matrix.at[idx].set(1 / matrix[idx]), where idx contains the off-diagonal indices selected with argwhere, but that does not work either (and I still get the nan error when the nan debugging is enabled). Any pointer towards solving the problem is most welcome.

import jax
import jax.random as jrd
import jax.numpy as jnp

import numpy as np  # needed for np.arange in polynomial() below
import numpy.polynomial.polynomial as poly


seed = 100
key = jrd.PRNGKey(seed)


def polynomial(coeffs):
    return lambda x: jnp.sum(coeffs * jnp.power(x, np.arange(len(coeffs))))


def initialize_roots(coeffs):
    return jrd.normal(key, (len(coeffs) - 1,), dtype=jnp.complex64)


def polynomial_roots(coeffs):
    coeffs = jnp.array(coeffs, dtype=jnp.complex64)

    f = jnp.vectorize(polynomial(coeffs))
    fd = jnp.vectorize(jax.grad(polynomial(coeffs), holomorphic=True))

    roots = initialize_roots(coeffs)
    offsets = jnp.zeros((len(coeffs) - 1,), dtype=jnp.complex64)

    def repulsion(roots):

        matrix = jax.vmap(jax.vmap(jnp.subtract, (None, 0)), (0, None))(roots, roots)
        matrix = jnp.nan_to_num(1 / matrix, posinf=0)

        return jnp.sum(matrix, axis=-1)

    def compute_offset(roots):
        ratios = f(roots) / fd(roots)
        offsets = jnp.nan_to_num(ratios / (1 - ratios * repulsion(roots)))

        return roots - offsets, offsets

    max_iter = 15

    roots, offsets = compute_offset(roots)
    # NOTE: grad gives nan if line below is included
    roots, offsets = compute_offset(roots)

    # for i in jnp.arange(max_iter):
    #     roots, offsets = compute_offset(roots)

    # body_fn = lambda i, val: compute_offset(val[0])
    # roots, offsets = jax.lax.fori_loop(
    #     0, max_iter,
    #     body_fn,
    #     (roots, offsets)
    # )


    idx = jnp.argsort(roots)
    return roots[idx], offsets[idx]


coeffs = jrd.normal(key, (8,), dtype=jnp.complex64)
print("Jax:", polynomial_roots(coeffs))
print("numpy:", poly.polyroots(coeffs))


def coeffs_from_a(a):
    return a**2, a + 2j, 10*a, a**3 + 3

def combine(a):
    return jnp.sum(polynomial_roots(coeffs_from_a(a))[0])


jax.grad(combine, holomorphic=True)(2+1j)

f0uriest · 2023-12-02T01:55:20Z


I think what you have there still divides by zero and then replaces the NaN with something finite. That usually works for forward-mode AD but can still cause problems with reverse mode. You might need to mask the input before dividing, like this:

mask = matrix == 0
matrix = jnp.where(mask, 0, 1 / jnp.where(mask, 1, matrix))

The inner where prevents any division by zero and the outer one puts in the correct value.
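
As a minimal sketch of the difference between the two approaches, the snippet below compares reverse-mode gradients on a small real-valued array that contains a zero. The function names inv_nan_to_num and inv_double_where and the test array are made up for illustration and are not part of the original code.

```python
import jax
import jax.numpy as jnp


def inv_nan_to_num(x):
    # Divide first, then patch the non-finite result afterwards.
    return jnp.sum(jnp.nan_to_num(1.0 / x, posinf=0.0))


def inv_double_where(x):
    mask = x == 0
    # Inner where: replace zeros by a harmless value before dividing.
    safe = jnp.where(mask, 1.0, x)
    # Outer where: put the desired value (0) back at the masked positions.
    return jnp.sum(jnp.where(mask, 0.0, 1.0 / safe))


x = jnp.array([0.0, 1.0, 2.0])

g_bad = jax.grad(inv_nan_to_num)(x)     # first entry is nan
g_good = jax.grad(inv_double_where)(x)  # all entries are finite

print(g_bad)
print(g_good)
```

In the first version the backward pass still multiplies a zero cotangent by the infinite local derivative of 1 / x at x = 0, which yields nan. In the second version the division never sees a zero, so every intermediate derivative stays finite.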

Still not sure why it would work for a single iteration but not for two though.
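
One plausible explanation, offered as a sketch and not as a confirmed diagnosis of the code above: in the first call the initial roots are random and do not depend on the differentiated parameter, so no derivative is propagated through 1 / matrix. In the second call the roots depend on the parameter, so the backward pass goes through the division at the zero diagonal. The toy functions below (constant_roots, dependent_roots, and the array x0 are hypothetical names and values) mimic the two situations with a real-valued repulsion term.

```python
import jax
import jax.numpy as jnp


def repulsion(roots):
    # Pairwise differences; the diagonal is exactly zero.
    diff = roots[:, None] - roots[None, :]
    return jnp.sum(jnp.nan_to_num(1.0 / diff, posinf=0.0), axis=-1)


x0 = jnp.array([1.0, 2.0, 4.0])


def constant_roots(a):
    # The roots do not depend on a: the division is not differentiated.
    return jnp.sum(a * repulsion(x0))


def dependent_roots(a):
    # The roots depend on a: the gradient flows through 1 / diff.
    return jnp.sum(repulsion(a * x0))


g_const = jax.grad(constant_roots)(1.5)  # finite
g_dep = jax.grad(dependent_roots)(1.5)   # nan

print(g_const, g_dep)
```

If this reading is right, the single-call version only appears to work because the problematic division sits outside the differentiated part of the computation.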


melsophos · 2023-12-02T16:53:19Z


Thanks a lot for your help! I just tried your code and it works perfectly.
I thought the problem was related to this entry in the FAQ, but I had not managed to make where work; now I understand it better (admittedly, I had given up on that approach quickly, since I had no where, and it was working with a single call).
