How to use pjit to speed up divide and conquer algorithm? #9361
AdrienCorenflos asked this question in Q&A (unanswered)
Hi,
I have a divide and conquer algorithm that essentially generalises prefix sums (`jax.lax.associative_scan`) to non-associative operators whilst retaining the logarithmic span complexity (and logarithmic compile time) of prefix sums in the size of the input data. The algorithm, written in an in-place fashion, goes roughly like the sketch below.
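In caricature, the structure is something like the following (a simplified stand-in rather than my actual code: `op` is just a placeholder merge operator, everything acts on the leading time axis, and `T` is assumed to be a power of two):

```python
# Simplified sketch only: `op`, the shapes and the power-of-two assumption
# are placeholders, not the real operator or model.
import jax
import jax.numpy as jnp


def dc_map(xs, op):
    """Log-depth divide-and-conquer combine over the leading axis.

    xs: array of shape (T, ...) with T a power of two.
    op: binary operator taking two blocks of shape (k, ...) and returning
        one merged block of shape (2 * k, ...); it need not be associative.
    """
    T = xs.shape[0]
    num_levels = T.bit_length() - 1  # log2(T) sequential merge levels
    block = 1
    for _ in range(num_levels):
        # Group the current blocks into (left, right) pairs and merge every
        # pair in parallel; the number of blocks halves at each level, so
        # only log2(T) sequential steps are needed.
        pairs = xs.reshape(-1, 2 * block, *xs.shape[1:])
        merged = jax.vmap(lambda p: op(p[:block], p[block:]))(pairs)
        xs = merged.reshape(T, *xs.shape[1:])
        block *= 2
    return xs


# Sanity check: with the classic (associative) prefix-sum merge, dc_map
# reduces to a cumulative sum.
def prefix_sum_merge(left, right):
    return jnp.concatenate([left, left[-1] + right])


print(dc_map(jnp.arange(1.0, 9.0), prefix_sum_merge))  # [1. 3. 6. ... 36.]
```

The point of the real algorithm is precisely that the merge operator does not have to be associative, which is why `jax.lax.associative_scan` cannot be used directly.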
While it works really well on a single GPU, the logarithmic scaling stops happening when `T` becomes too big. Intuitively, I tend to think that it can be made compatible with `pjit` so that the scaling continues past the number of threads on a given GPU. However, I am at a complete loss when trying to use it: I have been trying different combinations of meshes and partition specs, so far to no avail (see figure). I would very much appreciate it (and will acknowledge the help in a coming paper that uses this algorithm) if someone could help me fix this issue. My code is available on Colab (although I think TPU pods are not compatible with pjit on Colab yet?) at the following address: https://colab.research.google.com/drive/1frM5UgGlmky2nbpJCvPSkQIzgSt9hCis?usp=sharing. The tentative pjit version is in the function `dc_map_pjit`; the gist of what I have been trying is sketched below.
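Roughly, one of the combinations I have tried looks like this (one-dimensional device mesh, leading time axis sharded across it; the axis name `"data"` is arbitrary, and `dc_map` / `prefix_sum_merge` refer to the sketch above):

```python
# One of several mesh / PartitionSpec combinations attempted, shown as a
# rough sketch; dc_map and prefix_sum_merge come from the sketch above.
import numpy as np
import jax
import jax.numpy as jnp
from jax.experimental.maps import Mesh
from jax.experimental.pjit import pjit, PartitionSpec

# One-dimensional mesh over all available devices.
devices = np.asarray(jax.devices())
mesh = Mesh(devices, axis_names=("data",))

# Shard the leading (time) axis of both the input and the output
# across the "data" axis of the mesh.
spec = PartitionSpec("data")

dc_map_pjit = pjit(
    lambda xs: dc_map(xs, prefix_sum_merge),
    in_axis_resources=spec,
    out_axis_resources=spec,
)

with mesh:
    ys = dc_map_pjit(jnp.arange(1.0, 9.0))
```

This compiles and runs, but the scaling I get does not match what I would expect from distributing the leading axis.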
Also, on a side note, if the JAX team is interested, I could contribute the divide and conquer algorithm to the code base.
Thanks a lot to whoever is kind enough to read all the way to here,
Adrien