Pre-Compression During Extraction #14

@leafo

Add gzip pre-compression support during file extraction so that files are stored compressed in the bucket, eliminating the need for CDN-level compression.

Configuration

Settings (in zipserver.json):

  • PreCompressEnabled (bool) - Enable/disable pre-compression (default: false)
  • PreCompressExtensions ([]string) - Extensions to compress, defaults to:
    • .html, .js, .css, .svg, .wasm, .wav, .glb, .pck
  • PreCompressMinSize (int64) - Minimum file size in bytes (default: 1024 = 1KB)
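
A minimal sketch of how these settings might be read, assuming the config file is decoded with encoding/json on top of a struct pre-filled with the defaults. The preCompressConfig type and loadPreCompressConfig function are hypothetical names used only for illustration; in zipserver the fields would sit on the existing Config struct, and its actual loading code may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// preCompressConfig is a hypothetical stand-in for the three proposed
// fields; in zipserver they would live on the existing Config struct.
type preCompressConfig struct {
	PreCompressEnabled    bool     `json:",omitempty"`
	PreCompressExtensions []string `json:",omitempty"`
	PreCompressMinSize    int64    `json:",omitempty"`
}

// loadPreCompressConfig starts from the proposed defaults and overlays
// whatever the JSON document sets. Keys absent from the JSON keep their
// default values because json.Unmarshal leaves those fields untouched.
func loadPreCompressConfig(raw []byte) (preCompressConfig, error) {
	cfg := preCompressConfig{
		PreCompressMinSize:    1024, // 1KB
		PreCompressExtensions: []string{".html", ".js", ".css", ".svg", ".wasm", ".wav", ".glb", ".pck"},
	}
	err := json.Unmarshal(raw, &cfg)
	return cfg, err
}

func main() {
	raw := []byte(`{"PreCompressEnabled": true, "PreCompressMinSize": 4096}`)
	cfg, err := loadPreCompressConfig(raw)
	if err != nil {
		panic(err)
	}
	// prints: true 4096 8
	fmt.Println(cfg.PreCompressEnabled, cfg.PreCompressMinSize, len(cfg.PreCompressExtensions))
}
```

With this overlay approach, a config file that only sets PreCompressEnabled still gets the default extension list and the 1KB threshold.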

Implementation Steps

1. Add configuration fields (zipserver/config.go:162-189)

Add to Config struct:

PreCompressEnabled    bool     `json:",omitempty"`
PreCompressExtensions []string `json:",omitempty"`
PreCompressMinSize    int64    `json:",omitempty"`

Add to defaultConfig (line 202):

PreCompressMinSize: 1024, // 1KB
PreCompressExtensions: []string{".html", ".js", ".css", ".svg", ".wasm", ".wav", ".glb", ".pck"},

2. Add compression helper (zipserver/compress.go - new file)

package zipserver

// shouldPreCompress checks if a file should be pre-compressed based on config
func shouldPreCompress(filename string, size uint64, config *Config) bool

// gzipCompress compresses data using gzip
func gzipCompress(data []byte) ([]byte, error)

Logic for shouldPreCompress:

  • Return false if !config.PreCompressEnabled
  • Return false if size < uint64(config.PreCompressMinSize) (size is uint64 and the config field is int64, so the comparison needs a conversion)
  • Return false if extension not in PreCompressExtensions
  • Return false if already compressed (ends with .gz, .br, .zip, or another already-compressed extension listed under Edge Cases)
  • Return true otherwise
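
The rules above could be implemented roughly as follows. This is a sketch, not the final compress.go: the Config type here is a reduced stand-in with only the three proposed fields, the alreadyCompressed set is a hypothetical name, and matching extensions case-insensitively is an assumption the plan does not state.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"path"
	"strings"
)

// Config is a reduced stand-in holding only the three proposed fields.
type Config struct {
	PreCompressEnabled    bool
	PreCompressExtensions []string
	PreCompressMinSize    int64
}

// alreadyCompressed (hypothetical name) lists extensions whose contents
// are already compressed, taken from the Edge Cases list.
var alreadyCompressed = map[string]bool{
	".gz": true, ".br": true, ".zip": true,
	".png": true, ".jpg": true, ".gif": true, ".webp": true,
}

// shouldPreCompress checks if a file should be pre-compressed based on config.
func shouldPreCompress(filename string, size uint64, config *Config) bool {
	if config == nil || !config.PreCompressEnabled {
		return false
	}
	// The threshold is int64 while zip sizes are uint64, so convert
	// explicitly; a non-positive threshold disables the size check.
	if config.PreCompressMinSize > 0 && size < uint64(config.PreCompressMinSize) {
		return false
	}
	// Assumption: extensions are compared case-insensitively.
	ext := strings.ToLower(path.Ext(filename))
	if ext == "" || alreadyCompressed[ext] {
		return false
	}
	for _, allowed := range config.PreCompressExtensions {
		if strings.EqualFold(ext, allowed) {
			return true
		}
	}
	return false
}

// gzipCompress compresses data using gzip at the default level.
func gzipCompress(data []byte) ([]byte, error) {
	var buf bytes.Buffer
	w := gzip.NewWriter(&buf)
	if _, err := w.Write(data); err != nil {
		return nil, err
	}
	// Close flushes the remaining data and writes the gzip trailer.
	if err := w.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	cfg := &Config{
		PreCompressEnabled:    true,
		PreCompressMinSize:    1024,
		PreCompressExtensions: []string{".html", ".js", ".css"},
	}
	fmt.Println(shouldPreCompress("game/index.html", 2048, cfg)) // true
	fmt.Println(shouldPreCompress("game/index.html", 100, cfg))  // false: below threshold
	fmt.Println(shouldPreCompress("game/sprite.png", 2048, cfg)) // false: not in the list
}
```

Because the extension allow-list is checked last, the already-compressed check only matters if someone adds an extension such as .gz to PreCompressExtensions by mistake.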

3. Integrate into extraction (zipserver/archive.go:421-501)

In extractAndUploadOne(), after MIME detection (line ~478) and before upload (line 486):

// Pre-compress if configured and applicable
if resource.contentEncoding == "" && shouldPreCompress(key, file.UncompressedSize64, a.Config) {
    // Read all remaining data
    data, err := io.ReadAll(reader)
    if err != nil {
        return UploadFileResult{Error: err, Key: key}
    }

    // Compress
    compressed, err := gzipCompress(data)
    if err != nil {
        return UploadFileResult{Error: err, Key: key}
    }

    // Only use compressed if it's actually smaller
    if len(compressed) < len(data) {
        reader = bytes.NewReader(compressed)
        resource.contentEncoding = "gzip"
        resource.size = uint64(len(compressed))
        // Note: skip limitedReader since we already have the full data
    } else {
        reader = bytes.NewReader(data)
    }
}

4. Add tests (zipserver/compress_test.go - new file)

  • TestShouldPreCompress - extension matching, size threshold, already-compressed
  • TestGzipCompress - valid output, decompresses correctly

Files to Modify

  • zipserver/config.go - Add 3 config fields + defaults
  • zipserver/archive.go - ~15 lines in extractAndUploadOne()
  • zipserver/compress.go (new) - ~50 lines
  • zipserver/compress_test.go (new) - ~80 lines

Edge Cases

  • Skip if contentEncoding already set (pre-compressed in zip)
  • Skip already-compressed extensions (.gz, .br, .zip, .png, .jpg, .gif, .webp)
  • Only use compression if result is smaller (avoid bloating tiny files)
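
The "only if smaller" rule can be isolated into a small helper, sketched below. chooseBody is a hypothetical name, not part of the plan; it returns the bytes to upload plus the Content-Encoding value to record, with an empty string meaning the data is stored as-is.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
)

// gzipCompress compresses data using gzip at the default level.
func gzipCompress(data []byte) ([]byte, error) {
	var buf bytes.Buffer
	w := gzip.NewWriter(&buf)
	if _, err := w.Write(data); err != nil {
		return nil, err
	}
	if err := w.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// chooseBody (hypothetical helper) returns the gzip form and "gzip" only
// when compression actually shrinks the data; otherwise it returns the
// original bytes and an empty encoding.
func chooseBody(data []byte) (body []byte, encoding string, err error) {
	compressed, err := gzipCompress(data)
	if err != nil {
		return nil, "", err
	}
	if len(compressed) < len(data) {
		return compressed, "gzip", nil
	}
	return data, "", nil
}

func main() {
	// Highly repetitive markup compresses well.
	repetitive := bytes.Repeat([]byte("<div>hello</div>\n"), 200)
	body, enc, err := chooseBody(repetitive)
	if err != nil {
		panic(err)
	}
	fmt.Printf("repetitive: %d -> %d bytes, encoding=%q\n", len(repetitive), len(body), enc)

	// A two-byte input grows under gzip (header and trailer alone are
	// 18 bytes), so the original is kept.
	body, enc, err = chooseBody([]byte("hi"))
	if err != nil {
		panic(err)
	}
	fmt.Printf("tiny: %d bytes, encoding=%q\n", len(body), enc)
}
```

With PreCompressMinSize at 1KB the tiny-input branch should rarely trigger for text, but it still protects against listed extensions whose contents do not compress well in practice.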

Most commonly used file extensions in practice (top 50 by file count), for re-evaluating which extensions to compress:

 extension | file_count | count_rank | total_size | size_rank
-----------+------------+------------+------------+-----------
 png       |   27837928 |          1 | 3249 GB    |         7
 js        |    9868169 |          2 | 1130 GB    |        12
 dll       |    3936833 |          3 | 2183 GB    |         8
 ogg       |    3017305 |          4 | 970 GB     |        14
 json      |    2210448 |          5 | 100 GB     |        31
 html      |    1881239 |          6 | 1922 GB    |         9
 mp3       |    1747301 |          7 | 1590 GB    |        10
 svg       |    1300887 |          8 | 53 GB      |        42
 webp      |    1056907 |          9 | 328 GB     |        20
 css       |    1010270 |         10 | 7299 MB    |        84
 jpg       |     960955 |         11 | 277 GB     |        21
 webm      |     902614 |         12 | 411 GB     |        18
 m4a       |     836700 |         13 | 259 GB     |        23
 txt       |     803311 |         14 | 6375 MB    |        92
 ico       |     738685 |         15 | 7260 MB    |        85
 unityweb  |     720666 |         16 | 4947 GB    |         4
 wav       |     600917 |         17 | 812 GB     |        16
 gz        |     598659 |         18 | 5466 GB    |         2
 wasm      |     572796 |         19 | 12 TB      |         1
 xml       |     550423 |         20 | 14 GB      |        65
 br        |     543177 |         21 | 3651 GB    |         6
 import    |     448936 |         22 | 292 MB     |       333
 efkefc    |     439778 |         23 | 6809 MB    |        89
 (none)    |     416460 |         24 | 380 GB     |        19
 rpgmvp    |     292828 |         25 | 88 GB      |        33
 pck       |     249533 |         26 | 4483 GB    |         5
 config    |     244955 |         27 | 4788 MB    |       110
 pak       |     240105 |         28 | 119 GB     |        28
 png_      |     237382 |         29 | 38 GB      |        44
 data      |     219985 |         30 | 5246 GB    |         3
 ttf       |     210797 |         31 | 79 GB      |        35
 ogg_      |     205821 |         32 | 32 GB      |        46
 assets    |     204958 |         33 | 175 GB     |        25
 md5       |     203991 |         34 | 18 MB      |      1065
 bundle    |     170106 |         35 | 213 GB     |        24
 ress      |     169520 |         36 | 1298 GB    |        11
 rpgmvo    |     153565 |         37 | 60 GB      |        39
 map       |     152937 |         38 | 35 GB      |        45
 gif       |     144938 |         39 | 103 GB     |        29
 ctex      |     144145 |         40 | 17 GB      |        58
 cache     |     133280 |         41 | 9787 MB    |        77
 mp4       |     130329 |         42 | 1050 GB    |        13
 info      |     129623 |         43 | 43 GB      |        43
 cfg       |     126652 |         44 | 125 MB     |       488
 stex      |     124685 |         45 | 6672 MB    |        90
 aspx      |     105746 |         46 | 6099 MB    |        93
 mdb       |     104028 |         47 | 2739 MB    |       130
 ks        |     102759 |         48 | 439 MB     |       281
 browser   |     101829 |         49 | 156 MB     |       440
 rpgmvm    |      96910 |         50 | 27 GB      |        50
