Thanks for the great work.
However, this project seems to work only for video synthesis. I also looked at the related code in the OpenMMDiT project, which is for video synthesis as well. Are there any general-purpose DiT blocks designed for common use, e.g., for texture or audio tasks?