Hello everyone:
I get an error when I run e = Wh1 + Wh2.T

I think it is because e has dimension N by N, and in my graph N is about 400000, so e takes 400000 x 400000 x 4 / 1024 / 1024 / 1024 GB (about 600 GB). I want to know whether computing it in batches (using a batch size) would help.
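
To make the numbers concrete, here is a minimal sketch of the memory estimate and of what batching could look like. It assumes Wh1 and Wh2 are (N, 1) float32 NumPy arrays, which may not match the real code; dense_score_gib and score_row_blocks are hypothetical helper names made up for this illustration, not part of any library.

```python
import numpy as np


def dense_score_gib(n, bytes_per_elem=4):
    """Memory in GiB for a dense n x n matrix (4 bytes per element = float32)."""
    return n * n * bytes_per_elem / 1024**3


def score_row_blocks(Wh1, Wh2, block_rows):
    """Yield (start_row, block) where block = Wh1[start:stop] + Wh2.T.

    Assumes Wh1 and Wh2 have shape (N, 1). Each block has shape
    (rows_in_block, N), so only block_rows * N values are held at once
    instead of the full N x N matrix.
    """
    n = Wh1.shape[0]
    for start in range(0, n, block_rows):
        stop = min(start + block_rows, n)
        # (rows, 1) + (1, N) broadcasts to (rows, N)
        yield start, Wh1[start:stop] + Wh2.T


if __name__ == "__main__":
    # Estimate for the full graph: far too large for one dense matrix.
    print(f"dense e for N=400000: {dense_score_gib(400_000):.1f} GiB")

    # Tiny demo with made-up data: blocks stacked together equal the dense result.
    rng = np.random.default_rng(0)
    Wh1 = rng.standard_normal((6, 1)).astype(np.float32)
    Wh2 = rng.standard_normal((6, 1)).astype(np.float32)
    blocks = [block for _, block in score_row_blocks(Wh1, Wh2, block_rows=4)]
    print(np.array_equal(np.vstack(blocks), Wh1 + Wh2.T))
```

With this kind of row batching, each block only helps if it is consumed (for example reduced or written out) before the next one is built; keeping all blocks in memory brings back the full N x N cost.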
Thanks in advance!