[Feature][P1]: Bake Base Docker Images into EC2 AMI with Packer #28654

@rzabarazesh

🚀 The feature, motivation and pitch

Description

Currently, every build instance pulls the base Docker images (nvidia/cuda, Python base, PyTorch, etc.) from registries on first use. These base images are large (~8-10 GB) and rarely change. Pre-pulling them into the EC2 AMI with Packer lets instances start with a warm cache, which eliminates the initial pull and shortens builds, especially on new (cold) instances.
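
To make "warm cache" concrete, here is a minimal sketch of a check that could run on a freshly launched instance. It assumes the Docker CLI is on the PATH. The `REQUIRED_IMAGES` list and both function names are hypothetical, and the two tags are simply the ones named later in this issue.

```python
import subprocess

# Hypothetical list of images expected in the AMI; adjust to the real set.
REQUIRED_IMAGES = [
    "nvidia/cuda:12.9.1-devel-ubuntu20.04",
    "nvidia/cuda:12.9.1-runtime-ubuntu22.04",
]


def missing_images(required, listing):
    """Return the required images that do not appear in `listing`.

    `listing` is the text printed by
    `docker image ls --format '{{.Repository}}:{{.Tag}}'`, one image per line.
    """
    present = {line.strip() for line in listing.splitlines() if line.strip()}
    return [image for image in required if image not in present]


def local_image_listing():
    """Ask the local Docker daemon for its images as 'repository:tag' lines."""
    result = subprocess.run(
        ["docker", "image", "ls", "--format", "{{.Repository}}:{{.Tag}}"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

Calling `missing_images(REQUIRED_IMAGES, local_image_listing())` on a new instance returns an empty list when every expected image is already in the local store. Any entries it returns would still be pulled from a registry on first use.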

What You'll Do

  1. Analyze which base images are most frequently used:
    • nvidia/cuda:12.9.1-devel-ubuntu20.04
    • nvidia/cuda:12.9.1-runtime-ubuntu22.04
    • PyTorch images from download.pytorch.org
    • Python base images
  2. Modify Packer configuration to:
    • Pull base images during AMI build
    • Store them in /var/lib/docker
    • Verify images are properly saved in Docker cache
  3. Determine the optimal set of images to bake in (balancing AMI size against benefit)
  4. Set up AMI update automation:
    • Rebuild AMI monthly or when base images update
    • Version AMI appropriately
    • Update Terraform to use new AMI
  5. Test that instances launch with pre-cached images
  6. Document AMI maintenance procedures
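
One possible starting point for step 1 is to count how often each base image appears in `FROM` lines across the repository's Dockerfiles. This is a rough sketch, not a finished analysis: the function names are hypothetical, it only looks at files whose names start with `Dockerfile`, and references parameterised through `ARG` (for example `${BASE_IMAGE}`) are reported as written rather than resolved.

```python
import re
from collections import Counter
from pathlib import Path

# Matches "FROM [--platform=...] <image> [AS <stage>]" at the start of a line.
FROM_RE = re.compile(
    r"^[ \t]*FROM[ \t]+(?:--platform=\S+[ \t]+)?(\S+)(?:[ \t]+AS[ \t]+(\S+))?",
    re.IGNORECASE | re.MULTILINE,
)


def base_images(dockerfile_text):
    """Return external base images referenced by one Dockerfile.

    `scratch` and references to earlier build stages are skipped,
    since neither is pulled from a registry.
    """
    stages = set()
    images = []
    for image, stage in FROM_RE.findall(dockerfile_text):
        if image.lower() != "scratch" and image.lower() not in stages:
            images.append(image)
        if stage:
            stages.add(stage.lower())
    return images


def count_base_images(root):
    """Count base image references in every Dockerfile* under `root`."""
    counts = Counter()
    for path in Path(root).rglob("Dockerfile*"):
        if path.is_file():
            text = path.read_text(encoding="utf-8", errors="replace")
            counts.update(base_images(text))
    return counts
```

`count_base_images(".").most_common()` then gives a ranked list of candidates. Reference counts in Dockerfiles are only a proxy for how often an image is actually pulled on build instances, so CI logs or registry metrics would be a better signal where available.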

Deliverables

  • Analysis document: which base images to pre-cache and why
  • Modified Packer configuration with image pre-caching
  • Script to identify and pull latest base image versions
  • AMI size comparison (before/after)
  • Build time comparison (cold start with/without pre-cached images)
  • Automation for monthly AMI rebuilds
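
For the analysis document, the size-versus-benefit trade-off could be explored with a simple greedy heuristic: rank candidate images by pulls saved per GB and add them until an AMI size budget is reached. Everything here is an assumption for illustration: the `Candidate` fields, the `select_images` name, and the idea of a fixed budget in GB. Docker images share layers, so summing per-image sizes overestimates the real disk usage, and a greedy pass is not guaranteed to be optimal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    image: str            # image reference, e.g. "repo/name:tag"
    size_gb: float        # uncompressed size on disk (measured, not guessed)
    pulls_per_month: int  # how often build instances pull it


def select_images(candidates, budget_gb):
    """Greedily pick images with the best pulls-per-GB ratio within a budget.

    Returns (chosen candidates, total size in GB). Sizes are treated as
    additive, which ignores layers shared between images.
    """
    for c in candidates:
        if c.size_gb <= 0:
            raise ValueError(f"size_gb must be positive for {c.image}")

    ranked = sorted(
        candidates,
        key=lambda c: c.pulls_per_month / c.size_gb,
        reverse=True,
    )
    chosen, used = [], 0.0
    for c in ranked:
        if used + c.size_gb <= budget_gb:
            chosen.append(c)
            used += c.size_gb
    return chosen, used
```

The inputs would come from measurement: sizes from `docker image ls` on a build instance and pull counts from CI or registry data. The output is a starting point for the before/after AMI size comparison rather than a final answer.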

Alternatives

No response

Additional context

No response
