[Feature][P1]: Investigate and Implement Zstd Compression for Docker Images #28656

@rzabarazesh

🚀 The feature, motivation and pitch

Description

Docker image layers are currently compressed with gzip by default. Zstd (Zstandard) is a modern compression algorithm developed at Facebook (now Meta) that typically offers better compression ratios and significantly faster decompression than gzip. Switching could reduce image sizes and pull times, especially for the large vLLM images (~30 GB). This task investigates whether zstd compression provides measurable benefits.

What You'll Do

  1. Research Phase:

    • Understand Docker's compression options and BuildKit support
    • Review zstd compression levels (1-19 in the standard range, up to 22 with --ultra; 3 is the default)
    • Research ECR compatibility with zstd-compressed layers
    • Check whether containerd and the Docker runtime support zstd decompression
  2. Benchmarking Phase:

    • Build vLLM image with default gzip compression
    • Rebuild with zstd at various compression levels (3, 9, 15)
    • Measure: image size, build time, push time, pull time, decompression time
    • Test with both cold cache and warm cache scenarios
    • Benchmark on actual CI instances (g6.4xlarge)
  3. Implementation Phase (if justified):

    • Update Dockerfile and BuildKit configuration
    • Modify CI pipeline to use zstd compression
    • Document compression settings
    • Deploy to staging and monitor
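
As a starting point for the benchmarking phase, the sketch below assembles a `docker buildx build` invocation for each compression setting and times it. It assumes the BuildKit image exporter options `compression`, `compression-level`, `force-compression`, and `oci-mediatypes`; the registry name `registry.example.com/vllm-bench` and the helper names are hypothetical, and the exact flags should be checked against the BuildKit version used in CI.

```python
import shlex
import subprocess
import time


def buildx_command(image_ref, compression="zstd", level=None, context="."):
    """Return the argv for a `docker buildx build` that pushes image_ref
    with the requested layer compression."""
    output = [
        "type=image",
        f"name={image_ref}",
        "push=true",
        f"compression={compression}",
        # Apply the chosen compression to every layer, including layers
        # that already exist (for example those inherited from a base image).
        "force-compression=true",
        # Set explicitly rather than relying on exporter defaults.
        "oci-mediatypes=true",
    ]
    if level is not None:
        output.append(f"compression-level={level}")
    return ["docker", "buildx", "build", "--output", ",".join(output), context]


def timed_run(argv):
    """Run argv and return wall-clock seconds; raises if the command fails."""
    start = time.perf_counter()
    subprocess.run(argv, check=True)
    return time.perf_counter() - start


if __name__ == "__main__":
    # Hypothetical repository; replace with the real benchmark registry.
    repo = "registry.example.com/vllm-bench"
    # gzip baseline plus the zstd levels listed in the benchmarking phase.
    for compression, level in [("gzip", None), ("zstd", 3), ("zstd", 9), ("zstd", 15)]:
        suffix = "default" if level is None else str(level)
        argv = buildx_command(f"{repo}:{compression}-{suffix}", compression, level)
        # Only print here; call timed_run(argv) on a host with docker buildx.
        print(shlex.join(argv))
```

Because the build and push happen in one command here, `timed_run` measures them together. Separating build time from push time would need a two-step flow (for example exporting locally first), and cold-cache runs would need the builder cache cleared between iterations.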

Deliverables

  • Research document on zstd vs gzip for Docker images
  • Comprehensive benchmark results (table with all metrics)
  • Recommendation: which compression level to use (or stay with gzip)
  • Updated Dockerfile with zstd configuration (if implementing)
  • CI pipeline changes (if implementing)
  • Before/after comparison in production
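
For the before/after comparison, it helps to confirm that a pushed image really carries zstd layers rather than silently falling back to gzip. The sketch below classifies layers by the media type recorded in an image manifest and sums their compressed sizes. It assumes the manifest JSON has already been fetched (for example with `docker buildx imagetools inspect --raw`, resolving the per-platform manifest first if an image index is returned). The sample manifest and its sizes are made up for illustration.

```python
from collections import Counter

# Media-type suffix -> compression name.
_SUFFIXES = {
    "+zstd": "zstd",      # application/vnd.oci.image.layer.v1.tar+zstd
    "+gzip": "gzip",      # application/vnd.oci.image.layer.v1.tar+gzip
    ".tar.gzip": "gzip",  # application/vnd.docker.image.rootfs.diff.tar.gzip
}


def layer_compression(media_type):
    """Map a layer media type to 'zstd', 'gzip', 'uncompressed' or 'unknown'."""
    for suffix, name in _SUFFIXES.items():
        if media_type.endswith(suffix):
            return name
    if media_type.endswith(".tar"):
        return "uncompressed"
    return "unknown"


def summarize_layers(manifest):
    """Return {compression: (layer_count, total_compressed_bytes)}."""
    counts = Counter()
    sizes = Counter()
    for layer in manifest.get("layers", []):
        kind = layer_compression(layer["mediaType"])
        counts[kind] += 1
        sizes[kind] += layer.get("size", 0)
    return {kind: (counts[kind], sizes[kind]) for kind in counts}


# Illustrative manifest only: digests omitted, sizes are placeholders.
SAMPLE_MANIFEST = {
    "schemaVersion": 2,
    "mediaType": "application/vnd.oci.image.manifest.v1+json",
    "layers": [
        {"mediaType": "application/vnd.oci.image.layer.v1.tar+zstd", "size": 1000},
        {"mediaType": "application/vnd.oci.image.layer.v1.tar+zstd", "size": 2500},
        {"mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip", "size": 400},
    ],
}

if __name__ == "__main__":
    for kind, (count, total) in summarize_layers(SAMPLE_MANIFEST).items():
        print(f"{kind}: {count} layer(s), {total} bytes compressed")
```

A mixed result like the sample (some gzip layers remaining) would suggest that base-image layers were not recompressed, which is worth noting in the benchmark table since it affects both image size and pull time.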

Alternatives

No response

Additional context

No response

Before submitting a new issue...

  • Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.
