Can't this code train Llama2 on a single GPU? I keep getting a CUDA out-of-memory error. I'm using an A800.