Week 5: Sharded data-parallel training, distributed training optimizations

Lecture: slides
Seminar: slides
Homework: see the homework folder