Hello!
I am trying to recreate some of the experiments here with the CUDA-based implementations. I have not been able to find any information on the versions of PyTorch, CUDA, or other dependencies used in the repo, and nothing is specified in the paper either.
It would be really helpful if this could be provided. I have only gotten the CPU-only versions working, but they are prohibitively slow in most cases.
Great work! Thank you