CAIEC: End-to-End Performance Optimization for AI-Based Scientific Data Compression via Inference-Encoding Co-Design
This repository contains the reference implementation of CAIEC, a high-performance AI-based scientific data compression framework that co-designs model inference and entropy encoding to achieve state-of-the-art end-to-end throughput while preserving strong rate–distortion (RD) performance.
CAIEC significantly accelerates learned scientific data compression, achieving throughput comparable to leading non-AI GPU compressors, while consistently outperforming them in compression quality.
CAIEC introduces system-level optimizations across the entire learned compression pipeline:
- Slice-wise normalization for heterogeneous scientific fields
- Three-channel stacking to reuse mature 2D learned image compression models
- Block-wise resolution control for memory-efficient GPU inference
- Overlapped tiling with weighted overlap-add (OLA) reconstruction to suppress block artifacts
- Fine-grained FP8 / FP16 assignment based on component RD sensitivity
  - Preserves compression quality while significantly improving inference throughput
  - Achieves up to 6× inference speedup over FP32 baselines
- Fully GPU-based rANS entropy encoder
  - Adjustable parallel granularity to balance throughput vs. compression ratio
  - Avoids CPU–GPU synchronization overhead
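Two of the preprocessing steps above, slice-wise normalization and three-channel stacking, can be sketched in NumPy. This is an illustrative sketch, not the repository's implementation; the epsilon guard and the repeat-last-slice padding are assumptions made for the example.

```python
import numpy as np

def normalize_slices(field: np.ndarray, eps: float = 1e-12):
    """Normalize each 2D slice of a 3D field to [0, 1] independently,
    so slices with very different value ranges (heterogeneous fields)
    all land in the input range the learned codec expects.

    Returns the normalized field plus per-slice (min, range) so the
    transform can be inverted after decompression."""
    mins = field.min(axis=(1, 2), keepdims=True)
    ranges = field.max(axis=(1, 2), keepdims=True) - mins
    ranges = np.where(ranges < eps, 1.0, ranges)  # guard constant slices
    return (field - mins) / ranges, mins, ranges

def stack_three_channels(field: np.ndarray) -> np.ndarray:
    """Group consecutive slices into 3-channel 'images' so a mature 2D
    learned image codec (RGB input) can be reused unchanged. Pads by
    repeating the last slice when the slice count is not a multiple of 3
    (an assumption for this sketch)."""
    pad = (-field.shape[0]) % 3
    if pad:
        field = np.concatenate([field, np.repeat(field[-1:], pad, axis=0)])
    return field.reshape(-1, 3, *field.shape[1:])
```

Keeping the per-slice (min, range) pairs alongside the bitstream is what makes the normalization lossless to invert.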
Evaluated on 8 real-world scientific datasets (CESM, NYX, Hurricane, COVID, etc.) using NVIDIA H100 GPUs:
- Compression throughput: up to 14.48 GB/s, average 10.14 GB/s
- Decompression throughput: up to 7.43 GB/s, average 5.77 GB/s
- Speedup over vanilla CompressAI: up to 93.6× (compression) and 66.9× (decompression)
- Compression quality consistently outperforms state-of-the-art non-AI GPU compressors (cuSZ-Hi, cuZFP, PFPL)
CAIEC follows an end-to-end learned compression pipeline:
Scientific Data
↓
Preprocessing (Normalization, Stacking, Tiling)
↓
Encoder (Mixed Precision)
↓
Quantization
↓
GPU Entropy Coding (P-controlled rANS)
↓
Compressed Bitstream
The decompression pipeline mirrors the process with GPU-based entropy decoding and weighted reconstruction.
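The overlapped tiling and weighted reconstruction step can be illustrated with a 1D NumPy sketch. The tile size, stride, and triangular window below are assumptions for the example (CAIEC tiles in 2D); the point is that each output sample is a weighted average of every tile covering it, which blends tile-boundary errors instead of leaving visible seams.

```python
import numpy as np

def tile_overlapped(x: np.ndarray, tile: int, overlap: int):
    """Split a 1D signal into overlapping tiles (stride = tile - overlap).
    Returns (start_index, tile_data) pairs covering the whole signal."""
    stride = tile - overlap
    return [(s, x[s:s + tile]) for s in range(0, max(len(x) - overlap, 1), stride)]

def weighted_ola(tiles, length: int, tile: int) -> np.ndarray:
    """Weighted overlap-add reconstruction: accumulate each tile scaled by
    a window, accumulate the window weights, then divide. Samples in the
    overlap region become weighted averages of the tiles covering them."""
    out = np.zeros(length)
    wsum = np.zeros(length)
    # Triangular window; the epsilon keeps edge weights nonzero.
    win = np.bartlett(tile) + 1e-3
    for s, t in tiles:
        n = len(t)
        out[s:s + n] += t * win[:n]
        wsum[s:s + n] += win[:n]
    return out / wsum
```

With lossless per-tile data this round-trips exactly; with lossy per-tile reconstructions, the windowed averaging is what suppresses block artifacts.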
Requirements:
- Python ≥ 3.9
- PyTorch ≥ 2.0
- CUDA ≥ 12.0
- NVIDIA GPU with Tensor Cores (Ampere or newer recommended)
- CompressAI (modified)
- TensorRT (optional, for deployment)
Detailed installation instructions will be provided soon.
CAIEC has been evaluated on a wide range of scientific domains:
- Climate simulation (CESM)
- Cosmology simulation (NYX)
- Weather simulation (Hurricane)
- Medical imaging (COVID X-ray)
- Microscopy (STEM)
- Crystallography (SeSAD)
- X-ray tomography (Tomobank)
- Photography (Tecnick)
The following figures compare end-to-end compression and decompression throughput of CAIEC with representative AI-based and non-AI GPU compressors across multiple real-world scientific datasets.
End-to-end compression and decompression throughput (GB/s) and speedup. Speedup is computed relative to the corresponding CompressAI implementation (denoted as Ori; Ballé2016 as B; DCAE as D). CAIEC(B) and AIZ(B) are normalized to Ori(B), while CAIEC(D) is normalized to Ori(D). Our method is highlighted in blue, and the best speedup in each row is marked in orange.
Key observations:
- CAIEC achieves compression throughput on the order of 10 GB/s (up to 14.48 GB/s), reaching the same order of magnitude as state-of-the-art non-AI GPU compressors.
- Compared with vanilla CompressAI, CAIEC delivers up to 93.6× compression speedup and 66.9× decompression speedup.
We compare CAIEC against representative non-AI GPU compressors and existing AI-based approaches in terms of PSNR vs. bitrate (bpp).
Rate–distortion (RD) plots showing PSNR versus bitrate across eight representative datasets. The main panel in each subfigure zooms into the low-bitrate region, while the inset (bottom-right) shows the complete RD curves of all methods over the full bitrate range.
Key observations:
- CAIEC consistently achieves higher PSNR at the same bitrate compared to non-AI compressors (cuSZ-Hi, cuZFP, PFPL).
- At low bitrates, CAIEC preserves global structures and fine details more effectively.
- Mixed-precision inference introduces only minor RD degradation, while enabling substantial throughput gains.
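The sensitivity-based precision assignment can be illustrated with a toy NumPy sketch. Here, output perturbation on a probe input stands in for the RD sensitivity a real codec would measure, only FP16 round-tripping is simulated (CAIEC also assigns FP8), and the tolerance and layer names are assumptions for the example.

```python
import numpy as np

def assign_precision(layers: dict, probe: np.ndarray, tol: float) -> dict:
    """Toy sensitivity-based precision assignment: round-trip each layer's
    weights through FP16, measure how much the output on a probe input
    moves, and keep FP32 only where the perturbation exceeds `tol`."""
    plan = {}
    for name, w in layers.items():
        w16 = w.astype(np.float16).astype(np.float32)  # simulate FP16 storage
        err = np.max(np.abs(probe @ w16 - probe @ w))
        # Non-finite error means the weights overflowed FP16's range;
        # such components must stay in full precision.
        plan[name] = "fp16" if np.isfinite(err) and err <= tol else "fp32"
    return plan
```

Components whose weights (or activations) have large dynamic range fail the tolerance check and stay in FP32, while well-conditioned components are demoted to low precision, which is where the inference speedup comes from.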
CAIEC exposes a tunable parameter P to control the parallel granularity of GPU entropy coding, enabling explicit trade-offs between throughput and compression efficiency.
Throughput–efficiency trade-off under different values of the parallel-granularity parameter P.
Key observations:
- Smaller P improves throughput via higher parallelism, with modest compression-ratio loss.
- Larger P improves compression efficiency while maintaining sufficient throughput.
- CAIEC selects P such that entropy coding does not become the end-to-end bottleneck.
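A back-of-the-envelope model of the trade-off described above: splitting N symbols into independent rANS streams of P symbols each adds a fixed per-stream cost (the flushed final coder state plus a byte offset so streams can be decoded in parallel), so total overhead grows as P shrinks. The 8-byte per-stream figure is an assumption for illustration, not CAIEC's bitstream format.

```python
import math

def stream_overhead(n_symbols: int, p: int, per_stream_bytes: int = 8):
    """Model the size cost of P-controlled parallel rANS: symbols are split
    into ceil(N / P) independent streams, each paying a fixed flush/offset
    cost. Returns (number_of_parallel_streams, total_overhead_bytes)."""
    n_streams = math.ceil(n_symbols / p)
    return n_streams, n_streams * per_stream_bytes
```

Halving P roughly doubles both the available GPU parallelism and this fixed overhead, which is why a sensible policy is to pick P just small enough that entropy coding stops being the end-to-end bottleneck.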
🚧 This repository is under active development.
- Clean code release
- Installation scripts
- Reproducible evaluation pipeline
- Pretrained models
- Documentation & examples
Please stay tuned for updates.
This work builds upon:
- CompressAI
- Prior work on GPU scientific compression (cuSZ, cuZFP, PFPL)
- NVIDIA CUDA and TensorRT ecosystem


