
Zygote with Tullio gives wrong gradients/pullbacks using CUDA #185

@kc-111

using Tullio, Zygote, CUDA, KernelAbstractions, OMEinsum

# Show outer product of strings
A = ["x", "y", "z", "w"]
res = Array{String}(undef, length(A), length(A))
for (i, r) in enumerate(A)
    for (j, c) in enumerate(A)
        res[i, j] = string(r, c)
    end
end
display(res)

# Test outer product using einstein summation
A = rand(length(A), 100) # Last dim is batch
batchmul(A, B) = @tullio C[i,j,k] := A[i,k] * B[j,k]
# batchmul(A, B) = ein"ik,jk->ijk"(A, B)
outer_prod(A, B) = reshape(batchmul(A, B), size(A, 1)*size(B, 1), size(A, 2))
@show reshape(outer_prod(A, A), 4, 4, :) == batchmul(A, A)
# pullback returns the scalar loss and a closure that maps an output cotangent to input gradients
loss, back = pullback(p -> sum(outer_prod(p, p)), A)
gs = back(one(loss))[1]
display(gs)

# CUDA: same function and same data, moved to the GPU as Float32
A_cu = CuArray(Float32.(A))
loss, back = pullback(p -> sum(outer_prod(p, p)), A_cu)
gs = back(one(loss))[1]
display(gs)

Problem: With Tullio, the pullback on a CuArray returns a different gradient than the CPU run of the same function.
Using OMEinsum instead (the commented-out batchmul definition) with CUDA gives consistent and correct results.
Discourse Discussion: https://discourse.julialang.org/t/zygote-with-tullio-gives-wrong-gradients-pullbacks-using-cuda/110767
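
For anyone checking which of the two gradients is correct, the expected value can be worked out by hand. The loss is the sum over i, j, k of p[i,k] * p[j,k], so its derivative with respect to p[i,k] is 2 * sum_j p[j,k], the same for every row i. Below is a minimal sketch in plain Julia (no Tullio, Zygote, or CUDA) that computes this reference and cross-checks it with finite differences. The helper names loss_ref, grad_ref, and grad_fd are hypothetical and were not part of the original report.

```julia
# Plain-Julia reference for the loss in the report: sum over i, j, k of p[i,k] * p[j,k].
# loss_ref, grad_ref and grad_fd are hypothetical helper names, not from any package.
loss_ref(p) = sum(p[i, k] * p[j, k] for i in axes(p, 1), j in axes(p, 1), k in axes(p, 2))

# Analytic gradient: d(loss)/d(p[i,k]) = 2 * sum_j p[j,k], identical for every row i.
grad_ref(p) = repeat(2 .* sum(p, dims=1), size(p, 1), 1)

# Central finite differences as an independent check of the analytic formula.
function grad_fd(f, p; h=1e-6)
    g = similar(p)
    for idx in eachindex(p)
        q = copy(p)
        q[idx] += h          # evaluate at p[idx] + h
        fp = f(q)
        q[idx] -= 2h         # evaluate at p[idx] - h
        fm = f(q)
        g[idx] = (fp - fm) / (2h)
    end
    return g
end

p = rand(4, 5)
err = maximum(abs.(grad_ref(p) .- grad_fd(loss_ref, p)))  # should be close to zero
```

Comparing `Array(gs)` from the CPU and CUDA runs against `grad_ref(A)`, for example with `isapprox` and a tolerance suited to Float32, would show which of the two deviates from the expected gradient.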
