I am currently working on a project that requires training on GPUs. I tried the CPU version and everything works well. However, when I switch to the GPU, training with CUDA.jl is not well supported. The details are related to this issue: nzy1997/ParametricDFT.jl#24. Are there any plans to implement GPU support in this direction? If not, I could submit a PR with a demo of manifold optimization on the GPU (I currently have two 3090s that I can use to work on the implementation).