Commit 24b0cb2

minnend authored and copybara-github committed
Adds directory with rate-distortion results as CSV files.
PiperOrigin-RevId: 296055483 Change-Id: I1ab11c7728ebff278838945a8dfcaaa2b2db87c4
1 parent b996d5c commit 24b0cb2

102 files changed (+5350, -3 lines changed)

README.md

Lines changed: 11 additions & 3 deletions
@@ -243,10 +243,18 @@ pip uninstall tensorflow-compression
243243
To build packages for Darwin (and potentially other platforms), you can follow
244244
the same steps, but the Docker image should not be necessary.
245245

246+
## Evaluation
247+
248+
We provide evaluation results for several image compression methods in terms of
249+
different metrics in different colorspaces. Please see the
250+
[results subdirectory](https://tensorflow.github.io/compression/results/readme/image_compression/README.md)
251+
for more information.
252+
246253
## Authors
247254

248-
Johannes Ballé (github: [jonycgn](https://github.com/jonycgn)), Sung Jin Hwang
249-
(github: [ssjhv](https://github.com/ssjhv)), and Nick Johnston (github:
250-
[nmjohn](https://github.com/nmjohn))
255+
* Johannes Ballé (github: [jonycgn](https://github.com/jonycgn))
256+
* Sung Jin Hwang (github: [ssjhv](https://github.com/ssjhv))
257+
* Nick Johnston (github: [nmjohn](https://github.com/nmjohn))
258+
* David Minnen (github: [minnend](https://github.com/minnend))
251259

252260
Note that this is not an officially supported Google product.

results/image_compression/README.md

Lines changed: 174 additions & 0 deletions
@@ -0,0 +1,174 @@
1+
# Rate-distortion data for image compression
2+
3+
Subdirectories contain CSV files with rate-distortion (RD) data for different
4+
image compression methods. We include data for standard codecs (JPG, J2K, WebP,
5+
etc.) and many learning-based methods. Quality is measured by PSNR and MS-SSIM.
6+
7+
Note that not all combinations of compression methods, quality metrics, and
8+
evaluation data sets are covered.
9+
10+
### Table of Contents
11+
12+
* [Image Compression Methods](#image-compression-methods)
* [Quality Metrics](#quality-metrics)
* [Data Sets for Evaluation](#data-sets-for-evaluation)
15+
16+
## Image Compression Methods
17+
18+
--------------------------------------------------------------------------------
19+
20+
### Standard (Hand-Engineered) Codecs
21+
22+
* JPEG (4:2:0)
23+
* JPEG 2000 ([OpenJPEG](https://www.openjpeg.org) and
24+
[Kakadu](https://kakadusoftware.com/))
25+
* [WebP](https://developers.google.com/speed/webp)
26+
* [BPG](https://bellard.org/bpg/) (4:4:4 and 4:2:0)
27+
28+
### Learning-based Methods
29+
30+
1. [Context-adaptive Entropy Model for End-to-end Optimized Image Compression]
31+
(https://openreview.net/forum?id=HyxKIiAqYQ) \
32+
Jooyoung Lee, Seunghyun Cho, and Seung-Kwon Beack \
33+
Int. Conf. on Learning Representations (ICLR) 2019
34+
35+
2. [Joint autoregressive and hierarchical priors for learned image
36+
compression]
37+
(https://arxiv.org/abs/1809.02736) \
38+
David Minnen, Johannes Ballé, and George Toderici \
39+
Advances in Neural Information Processing Systems (NeurIPS) 2018
40+
41+
3. [Learning a Code-Space Predictor by Exploiting Intra-Image-Dependencies]
42+
(http://bmvc2018.org/contents/papers/0491.pdf) \
43+
Jan P. Klopp, Yu-Chiang Frank Wang, Shao-Yi Chien, and Liang-Gee Chen \
44+
British Machine Vision Conference (BMVC) 2018
45+
46+
4. [Variational Image Compression with a Scale Hyperprior]
47+
(https://arxiv.org/abs/1802.01436) \
48+
Johannes Ballé, David Minnen, Saurabh Singh, Sung Jin Hwang, and Nick
49+
Johnston \
50+
Int. Conf. on Learning Representations (ICLR) 2018
51+
52+
5. [Image-dependent local entropy models for image compression with deep
53+
networks]
54+
(https://arxiv.org/abs/1805.12295) \
55+
David Minnen, George Toderici, Saurabh Singh, Sung Jin Hwang, and Michele
56+
Covell \
57+
Int. Conf. on Image Processing (ICIP) 2018
58+
59+
6. [Improved Lossy Image Compression With Priming and Spatially Adaptive Bit
60+
Rates for Recurrent Networks]
61+
(https://arxiv.org/abs/1703.10114) \
62+
Nick Johnston, Damien Vincent, David Minnen, Michele Covell, Saurabh Singh,
63+
Troy Chinen, Sung Jin Hwang, Joel Shor, and George Toderici \
64+
IEEE Conf. on Computer Vision and Pattern Recognition (CVPR) 2018
65+
66+
7. [Real-Time Adaptive Image Compression]
67+
(https://arxiv.org/abs/1705.05823) \
68+
Oren Rippel and Lubomir Bourdev \
69+
International Conference on Machine Learning (ICML) 2017
70+
71+
8. [End-to-end Optimized Image Compression]
72+
(https://arxiv.org/abs/1611.01704) \
73+
Johannes Ballé, Valero Laparra, and Eero P. Simoncelli \
74+
Int. Conf. on Learning Representations (ICLR) 2017
75+
76+
9. [Lossy Image Compression with Compressive Autoencoders]
77+
(https://openreview.net/forum?id=rJiNwv9gg) \
78+
Lucas Theis, Wenzhe Shi, Andrew Cunningham, and Ferenc Huszár \
79+
Int. Conf. on Learning Representations (ICLR) 2017
80+
81+
10. [Spatially adaptive image compression using a tiled deep network]
82+
(https://arxiv.org/abs/1802.02629) \
83+
David Minnen, George Toderici, Michele Covell, Troy Chinen, Nick Johnston,
84+
Joel Shor, Sung Jin Hwang, Damien Vincent, and Saurabh Singh \
85+
Int. Conference on Image Processing (ICIP) 2017
86+
87+
11. [Full Resolution Image Compression with Recurrent Neural Networks]
88+
(https://arxiv.org/abs/1608.05148) \
89+
George Toderici, Damien Vincent, Nick Johnston, Sung Jin Hwang, David
90+
Minnen, Joel Shor, and Michele Covell \
91+
IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2017
92+
93+
## Quality Metrics
94+
95+
--------------------------------------------------------------------------------
96+
97+
### Peak Signal-to-Noise Ratio (PSNR)
98+
99+
According to
[Wikipedia](https://en.wikipedia.org/wiki/Peak_signal-to-noise_ratio):
101+
102+
> Peak signal-to-noise ratio, often abbreviated PSNR, is an engineering term for
103+
> the ratio between the maximum possible power of a signal and the power of
104+
> corrupting noise that affects the fidelity of its representation. Because many
105+
> signals have a very wide dynamic range, PSNR is usually expressed in terms of
106+
> the logarithmic decibel scale.
107+
108+
PSNR is commonly used to measure image quality even though its correlation with
109+
human preferences is rather low (see the [TID 2013
110+
study](http://www.ponomarenko.info/tid2013.htm)). You can calculate the PSNR
111+
between two images using
112+
[tf.image.psnr()](https://www.tensorflow.org/api_docs/python/tf/image/psnr).
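
For reference, the definition quoted above amounts to `PSNR = 10 * log10(MAX^2 / MSE)`. The following is a minimal pure-Python sketch of that formula, not the implementation behind `tf.image.psnr()`. The function name `psnr`, the flat-sequence input format, and the default peak value of 255 (8-bit images) are assumptions made for illustration.

```python
import math


def psnr(reference, distorted, max_value=255.0):
    """Returns PSNR in dB between two equally sized sequences of pixel values."""
    if len(reference) != len(distorted) or not reference:
        raise ValueError("inputs must be non-empty and have the same length")
    # Mean squared error over all pixel values.
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # Identical inputs: PSNR is unbounded.
    # Ratio of peak signal power to noise power, on a decibel scale.
    return 10.0 * math.log10(max_value ** 2 / mse)
```

Because the result is logarithmic, halving the MSE raises PSNR by about 3 dB regardless of the starting quality.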
113+
114+
### Multiscale Structural Similarity (MS-SSIM)
115+
116+
Multiscale Structural Similarity (MS-SSIM) is an extension of [structural
117+
similarity (SSIM)](https://en.wikipedia.org/wiki/Structural_similarity) that
118+
adds flexibility by measuring similarity at different spatial scales. It was
119+
developed in 2003 by Wang, Simoncelli, and Bovik
120+
([PDF](https://www.cns.nyu.edu/pub/eero/wang03b.pdf)). MS-SSIM is typically
121+
thought to better match human preferences than PSNR, although optimizing directly
for MS-SSIM can lead to objectionable distortion, e.g., blurrier reconstructions
around text and faces.
124+
125+
You can calculate the MS-SSIM score between two images using
126+
[tf.image.ssim_multiscale()](
127+
https://www.tensorflow.org/api_docs/python/tf/image/ssim_multiscale). Note that
128+
both SSIM and MS-SSIM have a maximum score of 1.0, and very small quantitative
129+
differences can imply very large visual differences. For this reason, we often
130+
graph MS-SSIM as decibels to improve readability using: `ms_ssim_db = -10 *
131+
log10(1 - ms_ssim)`.
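
The decibel conversion can be sketched in a few lines of Python. The helper name `ms_ssim_to_db` is hypothetical, and the range check is an added safeguard, since a perfect score of 1.0 would map to infinity.

```python
import math


def ms_ssim_to_db(ms_ssim):
    """Converts an MS-SSIM score in [0, 1) to decibels."""
    if not 0.0 <= ms_ssim < 1.0:
        raise ValueError("ms_ssim must be in [0, 1); a score of 1.0 maps to infinity")
    return -10.0 * math.log10(1.0 - ms_ssim)


# Scores that differ by less than 0.01 end up about 10 dB apart on this scale.
print(ms_ssim_to_db(0.99))   # approximately 20 dB
print(ms_ssim_to_db(0.999))  # approximately 30 dB
```

This spreads out the region close to 1.0, where most practical operating points lie, which is why the curves are easier to read in dB.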
132+
133+
### Colorspaces
134+
135+
Many research papers on learned image compression report image quality results
136+
(distortion) averaged over the RGB channels. While mathematically valid, this
137+
approach does not match the sensitivity of the human visual system (e.g. we're
138+
more sensitive to green than blue) and is **not** in line with common practice
139+
in the image processing community.
140+
141+
We provide RGB evaluation results to facilitate comparing against older papers,
142+
but we **strongly recommend** that future papers report results either on the
luminance channel only (`Y'` in `Y'CbCr`) or by using a 6:1:1 weighted average
over `Y'CbCr`.
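
As a rough illustration of the two recommended options, the sketch below combines per-channel scores with 6:1:1 weights and computes a luma value from R'G'B'. Both helper names are hypothetical. The text does not say whether the weights apply to per-channel PSNR, MSE, or another quantity, nor which conversion matrix is used, so the BT.601 luma coefficients here are an assumption.

```python
def weighted_ycbcr_score(y_score, cb_score, cr_score):
    """Combines per-channel scores with 6:1:1 weights (normalized by 8)."""
    return (6.0 * y_score + cb_score + cr_score) / 8.0


def rgb_to_luma_bt601(r, g, b):
    """Returns BT.601 luma Y' from gamma-encoded R'G'B' values on a shared scale."""
    # Green carries the largest weight and blue the smallest, which mirrors
    # the difference in visual sensitivity mentioned in the text.
    return 0.299 * r + 0.587 * g + 0.114 * b


# Hypothetical per-channel scores: the luma channel dominates the average.
print(weighted_ycbcr_score(40.0, 32.0, 32.0))  # 38.0
```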
145+
146+
## Data Sets for Evaluation
147+
148+
--------------------------------------------------------------------------------
149+
150+
### Kodak
151+
152+
The Kodak data set is a collection of 24 images with resolution 768x512 (or
153+
512x768). The images are available as PNG files here:
154+
[http://r0k.us/graphics/kodak](http://r0k.us/graphics/kodak)
155+
156+
@misc{kodak,
157+
title="Kodak Lossless True Color Image Suite ({PhotoCD PCD0992})",
158+
author="Eastman Kodak",
159+
url = {http://r0k.us/graphics/kodak},
160+
}
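
The rate axis of the RD data is given in bits per pixel (bpp). A minimal sketch of that arithmetic for a Kodak-sized image follows, assuming bpp means the compressed size in bits divided by the number of pixels. The helper name and the example file size are hypothetical, and whether container headers count toward the size is not stated in the source.

```python
def bits_per_pixel(compressed_size_bytes, width, height):
    """Returns the rate in bits per pixel for one compressed image."""
    return compressed_size_bytes * 8.0 / (width * height)


# A Kodak image has 768 * 512 = 393216 pixels, so a hypothetical
# 49152-byte compressed file corresponds to exactly 1.0 bpp.
print(bits_per_pixel(49152, 768, 512))  # 1.0
```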
161+
162+
### Tecnick
163+
164+
The Tecnick data set contains 100 images with resolution 1200x1200. It is
available for download here (511 MB):
166+
[https://sourceforge.net/projects/testimages/files/OLD/OLD_SAMPLING/testimages.zip](https://sourceforge.net/projects/testimages/files/OLD/OLD_SAMPLING/testimages.zip)
167+
168+
@inproceedings{tecnick,
169+
author = {N. Asuni and A. Giachetti},
170+
title = {{TESTIMAGES}: A large-scale archive for testing visual devices and basic image processing algorithms {(SAMPLING 1200 RGB set)}},
171+
year = {2014},
172+
booktitle = {{STAG}: Smart Tools and Apps for Graphics},
173+
url = {https://sourceforge.net/projects/testimages/files/OLD/OLD_SAMPLING/testimages.zip},
174+
}
Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
1+
# Aggregate rate-distortion data for "Ballé 2017 (ICLR)" on kodak.
2+
# The first column contains bits per pixel (bpp) values.
3+
# The second column contains MS-SSIM/sRGB/R'G'B' values.
4+
#
5+
# Notes:
6+
# 1. Aggregate values were calculated by averaging over a constant
7+
# lambda value.
8+
# 2. We often graph MS-SSIM values in dB for visual clarity using:
9+
# ms_ssim_db = -10 * log10(1 - ms_ssim).
10+
#
11+
# If you have questions or corrections, please contact:
12+
# David Minnen ([email protected]) or George Toderici ([email protected]).
13+
14+
0.119752, 0.903700
15+
0.194591, 0.931041
16+
0.316000, 0.954783
17+
0.481060, 0.969139
18+
0.721303, 0.980815
19+
1.060841, 0.986755
20+
1.458681, 0.992090
21+
1.957564, 0.994965
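
A file in this format (comment lines starting with `#`, blank lines, then `bpp, value` pairs) can be read with a few lines of Python. This is a sketch under the assumption that every data row has exactly two comma-separated numbers. The function name `parse_rd_csv` is hypothetical, and the sample text is a shortened copy of the file above.

```python
def parse_rd_csv(text):
    """Parses RD data into a list of (bpp, quality) tuples."""
    points = []
    for line in text.splitlines():
        line = line.strip()
        # Skip the descriptive header comments and blank separator lines.
        if not line or line.startswith("#"):
            continue
        bpp, quality = (float(field) for field in line.split(","))
        points.append((bpp, quality))
    return points


sample = """\
# Aggregate rate-distortion data for "Ballé 2017 (ICLR)" on kodak.
# The first column contains bits per pixel (bpp) values.

0.119752, 0.903700
0.194591, 0.931041
"""
print(parse_rd_csv(sample))
```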
Lines changed: 21 additions & 0 deletions
@@ -0,0 +1,21 @@
1+
# Aggregate rate-distortion data for "Ballé 2018 (ICLR)" on kodak.
2+
# The first column contains bits per pixel (bpp) values.
3+
# The second column contains MS-SSIM/sRGB/R'G'B' values.
4+
#
5+
# Notes:
6+
# 1. Aggregate values were calculated by averaging over a constant
7+
# lambda value.
8+
# 2. We often graph MS-SSIM values in dB for visual clarity using:
9+
# ms_ssim_db = -10 * log10(1 - ms_ssim).
10+
#
11+
# If you have questions or corrections, please contact:
12+
# David Minnen ([email protected]) or George Toderici ([email protected]).
13+
14+
0.115239, 0.907527
15+
0.185698, 0.936307
16+
0.301804, 0.958691
17+
0.468972, 0.972416
18+
0.686378, 0.982478
19+
0.966864, 0.988344
20+
1.307441, 0.992647
21+
1.727503, 0.995267
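
Because the two methods are sampled at different rates, comparing them at a common bpp requires interpolation. The sketch below uses plain linear interpolation between neighboring points, which is a simplification chosen for illustration and not a method described in the source. The function name is hypothetical, and the data points are copied from the two files above.

```python
from bisect import bisect_right


def interpolate_quality(points, bpp):
    """Linearly interpolates quality at `bpp` from (bpp, quality) pairs sorted by bpp."""
    rates = [p[0] for p in points]
    if not rates[0] <= bpp <= rates[-1]:
        raise ValueError("bpp is outside the measured range")
    i = bisect_right(rates, bpp)
    if i == len(rates):
        return points[-1][1]  # bpp equals the highest measured rate.
    (r0, q0), (r1, q1) = points[i - 1], points[i]
    return q0 + (q1 - q0) * (bpp - r0) / (r1 - r0)


# Excerpts of the Kodak MS-SSIM data listed above.
balle_2017 = [(0.316000, 0.954783), (0.481060, 0.969139), (0.721303, 0.980815)]
balle_2018 = [(0.301804, 0.958691), (0.468972, 0.972416), (0.686378, 0.982478)]

q17 = interpolate_quality(balle_2017, 0.5)
q18 = interpolate_quality(balle_2018, 0.5)
print(q17, q18)
```

At 0.5 bpp the interpolated MS-SSIM of the 2018 model is higher than that of the 2017 model, consistent with the neighboring measured points.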
Lines changed: 64 additions & 0 deletions
@@ -0,0 +1,64 @@
1+
# Aggregate rate-distortion data for "BPG (4:2:0)" on kodak.
2+
# The first column contains bits per pixel (bpp) values.
3+
# The second column contains MS-SSIM/sRGB/R'G'B' values.
4+
#
5+
# Notes:
6+
# 1. Aggregate values were calculated by averaging over a constant QP value.
7+
# 2. We often graph MS-SSIM values in dB for visual clarity using:
8+
# ms_ssim_db = -10 * log10(1 - ms_ssim).
9+
#
10+
# If you have questions or corrections, please contact:
11+
# David Minnen ([email protected]) or George Toderici ([email protected]).
12+
13+
0.023778, 0.784989
14+
0.028261, 0.800720
15+
0.034131, 0.816491
16+
0.041103, 0.832549
17+
0.049042, 0.846330
18+
0.058963, 0.860314
19+
0.070547, 0.872750
20+
0.083888, 0.885062
21+
0.100428, 0.896837
22+
0.117291, 0.906517
23+
0.138370, 0.916372
24+
0.161282, 0.924601
25+
0.189494, 0.933334
26+
0.219051, 0.939601
27+
0.256016, 0.947138
28+
0.294495, 0.952187
29+
0.338370, 0.957859
30+
0.385684, 0.961779
31+
0.440687, 0.966350
32+
0.500803, 0.970688
33+
0.569223, 0.974397
34+
0.642459, 0.977489
35+
0.711308, 0.979273
36+
0.795128, 0.981873
37+
0.884535, 0.983938
38+
0.977693, 0.985811
39+
1.083310, 0.987469
40+
1.193119, 0.989031
41+
1.302428, 0.990242
42+
1.425981, 0.991342
43+
1.558065, 0.992281
44+
1.699156, 0.993105
45+
1.862072, 0.993914
46+
2.036448, 0.994620
47+
2.211966, 0.995205
48+
2.412092, 0.995756
49+
2.613790, 0.996303
50+
2.838744, 0.996711
51+
3.068833, 0.997057
52+
3.303337, 0.997373
53+
3.539684, 0.997648
54+
3.787450, 0.997908
55+
4.045572, 0.998128
56+
4.317412, 0.998322
57+
4.674708, 0.998600
58+
4.988948, 0.998819
59+
5.355372, 0.999060
60+
5.604246, 0.999147
61+
5.786789, 0.999190
62+
5.974362, 0.999228
63+
6.206643, 0.999251
64+
6.442166, 0.999261
Lines changed: 64 additions & 0 deletions
@@ -0,0 +1,64 @@
1+
# Aggregate rate-distortion data for "BPG (4:4:4)" on kodak.
2+
# The first column contains bits per pixel (bpp) values.
3+
# The second column contains MS-SSIM/sRGB/R'G'B' values.
4+
#
5+
# Notes:
6+
# 1. Aggregate values were calculated by averaging over a constant QP value.
7+
# 2. We often graph MS-SSIM values in dB for visual clarity using:
8+
# ms_ssim_db = -10 * log10(1 - ms_ssim).
9+
#
10+
# If you have questions or corrections, please contact:
11+
# David Minnen ([email protected]) or George Toderici ([email protected]).
12+
13+
0.023857, 0.783939
14+
0.028540, 0.800621
15+
0.034282, 0.815880
16+
0.041183, 0.830978
17+
0.049301, 0.845919
18+
0.058861, 0.859465
19+
0.070444, 0.871920
20+
0.084175, 0.884832
21+
0.100255, 0.896600
22+
0.119211, 0.908020
23+
0.140076, 0.917528
24+
0.165529, 0.926932
25+
0.193939, 0.935368
26+
0.226882, 0.943184
27+
0.264902, 0.950550
28+
0.308004, 0.956814
29+
0.353821, 0.962287
30+
0.406663, 0.967124
31+
0.465174, 0.971309
32+
0.528513, 0.975060
33+
0.602615, 0.978421
34+
0.681227, 0.981216
35+
0.763150, 0.983577
36+
0.855122, 0.985761
37+
0.954195, 0.987516
38+
1.058729, 0.989023
39+
1.178031, 0.990455
40+
1.302895, 0.991701
41+
1.427853, 0.992658
42+
1.570484, 0.993543
43+
1.724310, 0.994321
44+
1.890798, 0.994994
45+
2.085680, 0.995657
46+
2.296926, 0.996264
47+
2.513738, 0.996777
48+
2.762745, 0.997260
49+
3.021654, 0.997676
50+
3.311613, 0.998029
51+
3.616902, 0.998321
52+
3.943210, 0.998584
53+
4.282488, 0.998796
54+
4.657569, 0.998987
55+
5.069882, 0.999142
56+
5.543335, 0.999276
57+
6.176737, 0.999429
58+
6.804908, 0.999547
59+
7.545595, 0.999672
60+
8.045927, 0.999713
61+
8.453601, 0.999736
62+
8.895050, 0.999755
63+
9.349519, 0.999765
64+
9.810592, 0.999770
