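<img width="1844" height="602" alt="Image" src="https://github.com/user-attachments/assets/393e1f86-c98e-45f3-af35-2f5cf0dc89f6" /> Looking at the code, the input image only seems to go through the VAE; it does not appear to be passed through the ViT before being fed to the transformer. Could someone explain?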