### Ideas

- [ ] clip gradients
- [ ] remove skip connections between SWIN encoder and UNet decoder
- [ ] try diffusion with pre-trained SWIN
- [ ] train with weight decay

### General Todos

- [ ] get gradient logging to work
- [ ] merge most branches into main
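
For the "clip gradients" and "get gradient logging to work" items, here is a rough, framework-free sketch of clipping by global L2 norm. It is only an illustration under assumptions: the name `clip_by_global_norm`, the `eps` value, and the toy gradients are hypothetical and not taken from this repo. The pre-clip norm it returns is probably the number worth logging per step.

```python
import math


def clip_by_global_norm(grads, max_norm, eps=1e-6):
    """Scale all gradients so their combined L2 norm is at most max_norm.

    grads: list of lists of floats (one inner list per parameter tensor).
    Returns (clipped_grads, total_norm), where total_norm is the norm
    before clipping, i.e. the value worth logging.
    """
    total_norm = math.sqrt(sum(g * g for grad in grads for g in grad))
    # Only ever scale down; gradients under the threshold pass through unchanged.
    scale = min(1.0, max_norm / (total_norm + eps))
    clipped = [[g * scale for g in grad] for grad in grads]
    return clipped, total_norm


# Toy gradients (hypothetical values) with a global norm of 5.0.
grads = [[3.0, 0.0], [0.0, 4.0]]
clipped, norm_before = clip_by_global_norm(grads, max_norm=1.0)
norm_after = math.sqrt(sum(g * g for grad in clipped for g in grad))
print(f"grad norm before: {norm_before:.3f}, after: {norm_after:.3f}")
```

If training runs on PyTorch, `torch.nn.utils.clip_grad_norm_` does the same kind of global-norm clipping in place and returns the total norm, so the built-in is likely the better choice there.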