A complete Huffman encoder/decoder implementation in Zig. This project is a deliberate practice exercise to become more familiar with low-level programming and memory management, inspired by the coding challenge found here.
- Full Huffman compression and decompression
- Binary tree serialization/deserialization
- Bit-level I/O operations
- Dynamic buffer management
- Works on any file type (optimized for text)
- Zig 0.15.2 or later
Install Zig by following the instructions from the official Zig website.
zig build./zig-out/bin/huffman --input <file> --output <compressed_file>./zig-out/bin/huffman --decode --input <compressed_file> --output <decompressed_file># Compress a text file
./zig-out/bin/huffman --input book.txt --output book.huff
# Decompress it back
./zig-out/bin/huffman --decode --input book.huff --output book_decoded.txt
# Verify the files are identical
diff book.txt book_decoded.txtHuffman encoding works best on files with non-uniform byte distribution:
| File Type | Typical Compression Ratio |
|---|---|
| Plain text | 1.5x - 2x |
| Source code | 1.5x - 1.8x |
| Already compressed (PDF, JPG, ZIP) | ~1x (no gain) |
Example on a 750KB text file:
Original: 752583 bytes
Encoded: 434232 bytes
Compression ratio: 1.73
- Frequency Analysis: Count byte occurrences in the input file
- Tree Construction: Build a Huffman tree using a priority queue
- Code Generation: Generate variable-length binary codes (frequent bytes get shorter codes)
- Encoding: Replace each byte with its Huffman code
- Serialization: Store the tree and encoded data in the output file
The compressed file format:
[serialized tree][encoded data][padding info (1 byte)]
src/
main.zig - Entry point, CLI handling
tree.zig - Huffman tree and priority queue
encoder.zig - Code generation and encoding
encoded_data.zig - BitReader/BitWriter, decoding
helpers.zig - File I/O, argument parsing
This project was built to practice:
- Manual memory management with allocators
- Bit manipulation and binary I/O
- Tree data structures
- Low-level file operations
- Zig's type system and error handling
Based on the Huffman encoding challenge from Coding Challenges.