Problem Description
Docker builds on the Slurm controller node fail with a "no space left on device" error. The root volume has only 50 GB of capacity by default, which leaves insufficient space for building Docker images.
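To confirm that the root volume is the bottleneck before a build starts, a quick free-space check can help. The following is a minimal sketch using only the Python standard library; the helper names and the 20 GiB threshold are assumptions chosen for illustration, not values taken from the workshops.

```python
import shutil


def free_gib(path: str = "/") -> float:
    """Return the free space, in GiB, on the filesystem that contains `path`."""
    usage = shutil.disk_usage(path)
    return usage.free / 2**30


def has_room(path: str = "/", required_gib: float = 20.0) -> bool:
    """Return True if `path` has at least `required_gib` GiB free.

    The 20 GiB default is a hypothetical threshold; adjust it to the
    size of the images being built.
    """
    return free_gib(path) >= required_gib


if __name__ == "__main__":
    print(f"Free space on /: {free_gib('/'):.1f} GiB")
    if not has_room("/"):
        print("Root volume is likely too small for a Docker image build.")
```

Running the same check against the EBS mount point shows whether moving image storage there would actually relieve the pressure.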
Impact
This issue affects multiple workshops and documentation that require Docker image builds, causing build failures and blocking users from completing the tutorials.
Affected Resources
Workshop: Picotron Docker Setup
Repository: NCCL Tests
Proposed Solutions
Change the containerd root directory to a path on an EBS volume, such as /opt/sagemaker.
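One way to apply this is to set the top-level root key in the containerd configuration file, which normally lives at /etc/containerd/config.toml and defaults to root = "/var/lib/containerd". The sketch below rewrites that key in a configuration string. The function name and the /opt/sagemaker/containerd subdirectory are hypothetical, and the line-based editing assumes a conventionally formatted config file rather than arbitrary TOML.

```python
import re


def set_containerd_root(config_text: str, new_root: str) -> str:
    """Return `config_text` with the top-level `root` key set to `new_root`.

    Only a `root` key that appears before the first [table] header is
    treated as the top-level key. If none exists, one is inserted just
    before the first table header (or appended if there are no tables).
    """
    out = []
    replaced = False
    in_table = False
    for line in config_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("["):
            if not in_table and not replaced:
                # No top-level root found yet: add it before the first table.
                out.append(f'root = "{new_root}"')
                replaced = True
            in_table = True
        elif not in_table and re.match(r"root\s*=", stripped):
            line = f'root = "{new_root}"'
            replaced = True
        out.append(line)
    if not replaced:
        out.append(f'root = "{new_root}"')
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = 'version = 2\nroot = "/var/lib/containerd"\n[grpc]\n'
    # Hypothetical target directory on the EBS volume.
    print(set_containerd_root(sample, "/opt/sagemaker/containerd"))
```

containerd has to be restarted for the new root to take effect, and existing images under the old directory are not moved automatically. If the node stores images through the Docker daemon's own storage rather than containerd, the analogous setting is the data-root key in /etc/docker/daemon.json.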