| 1 | +# Rate-distortion data for image compression |
| 2 | + |
| 3 | +Subdirectories contain CSV files with rate-distortion (RD) data for different |
| 4 | +image compression methods. We include data for standard codecs (JPEG, JPEG 2000, |
| 5 | +WebP, etc.) and many learning-based methods. Quality is measured by PSNR and MS-SSIM. |
| 6 | + |
| 7 | +Note that not all combinations of compression methods, quality metrics, and |
| 8 | +evaluation data sets are covered. |
| 9 | + |
| 10 | +### Table of Contents |
| 11 | + |
| 12 | +* [Image Compression Methods](#image_compression_methods) |
| 13 | +* [Quality Metrics](#quality_metrics) |
| 14 | +* [Data Sets for Evaluation](#data_sets_for_evaluation) |
| 15 | + |
| 16 | +## Image Compression Methods |
| 17 | + |
| 18 | +-------------------------------------------------------------------------------- |
| 19 | + |
| 20 | +### Standard (Hand-Engineered) Codecs |
| 21 | + |
| 22 | +* JPEG (4:2:0) |
| 23 | +* JPEG 2000 ([OpenJPEG](https://www.openjpeg.org) and |
| 24 | + [Kakadu](https://kakadusoftware.com/)) |
| 25 | +* [WebP](https://developers.google.com/speed/webp) |
| 26 | +* [BPG](https://bellard.org/bpg/) (4:4:4 and 4:2:0) |
| 27 | + |
| 28 | +### Learning-based Methods |
| 29 | + |
| 30 | +1. [Context-adaptive Entropy Model for End-to-end Optimized Image Compression](https://openreview.net/forum?id=HyxKIiAqYQ) \ |
| 32 | + Jooyoung Lee, Seunghyun Cho, and Seung-Kwon Beack \ |
| 33 | + Int. Conf. on Learning Representations (ICLR) 2019 |
| 34 | + |
| 35 | +2. [Joint autoregressive and hierarchical priors for learned image |
| 36 | + compression](https://arxiv.org/abs/1809.02736) \ |
| 38 | + David Minnen, Johannes Ballé, and George Toderici \ |
| 39 | + Advances in Neural Information Processing Systems (NeurIPS) 2018 |
| 40 | + |
| 41 | +3. [Learning a Code-Space Predictor by Exploiting Intra-Image-Dependencies](http://bmvc2018.org/contents/papers/0491.pdf) \ |
| 43 | + Jan P. Klopp, Yu-Chiang Frank Wang, Shao-Yi Chien, and Liang-Gee Chen \ |
| 44 | + British Machine Vision Conference (BMVC) 2018 |
| 45 | + |
| 46 | +4. [Variational Image Compression with a Scale Hyperprior](https://arxiv.org/abs/1802.01436) \ |
| 48 | + Johannes Ballé, David Minnen, Saurabh Singh, Sung Jin Hwang, and Nick |
| 49 | + Johnston \ |
| 50 | + Int. Conf. on Learning Representations (ICLR) 2018 |
| 51 | + |
| 52 | +5. [Image-dependent local entropy models for image compression with deep |
| 53 | + networks](https://arxiv.org/abs/1805.12295) \ |
| 55 | + David Minnen, George Toderici, Saurabh Singh, Sung Jin Hwang, and Michele |
| 56 | + Covell \ |
| 57 | + Int. Conf. on Image Processing (ICIP) 2018 |
| 58 | + |
| 59 | +6. [Improved Lossy Image Compression With Priming and Spatially Adaptive Bit |
| 60 | + Rates for Recurrent Networks](https://arxiv.org/abs/1703.10114) \ |
| 62 | + Nick Johnston, Damien Vincent, David Minnen, Michele Covell, Saurabh Singh, |
| 63 | + Troy Chinen, Sung Jin Hwang, Joel Shor, and George Toderici \ |
| 64 | + IEEE Conf. on Computer Vision and Pattern Recognition (CVPR) 2018 |
| 65 | + |
| 66 | +7. [Real-Time Adaptive Image Compression](https://arxiv.org/abs/1705.05823) \ |
| 68 | + Oren Rippel and Lubomir Bourdev \ |
| 69 | + Int. Conf. on Machine Learning (ICML) 2017 |
| 70 | + |
| 71 | +8. [End-to-end Optimized Image Compression](https://arxiv.org/abs/1611.01704) \ |
| 73 | + Johannes Ballé, Valero Laparra, and Eero P. Simoncelli \ |
| 74 | + Int. Conf. on Learning Representations (ICLR) 2017 |
| 75 | + |
| 76 | +9. [Lossy Image Compression with Compressive Autoencoders](https://openreview.net/forum?id=rJiNwv9gg) \ |
| 78 | + Lucas Theis, Wenzhe Shi, Andrew Cunningham, and Ferenc Huszár \ |
| 79 | + Int. Conf. on Learning Representations (ICLR) 2017 |
| 80 | + |
| 81 | +10. [Spatially adaptive image compression using a tiled deep network](https://arxiv.org/abs/1802.02629) \ |
| 83 | + David Minnen, George Toderici, Michele Covell, Troy Chinen, Nick Johnston, |
| 84 | + Joel Shor, Sung Jin Hwang, Damien Vincent, and Saurabh Singh \ |
| 85 | + Int. Conf. on Image Processing (ICIP) 2017 |
| 86 | + |
| 87 | +11. [Full Resolution Image Compression with Recurrent Neural Networks](https://arxiv.org/abs/1608.05148) \ |
| 89 | + George Toderici, Damien Vincent, Nick Johnston, Sung Jin Hwang, David |
| 90 | + Minnen, Joel Shor, and Michele Covell \ |
| 91 | + IEEE Conf. on Computer Vision and Pattern Recognition (CVPR) 2017 |
| 92 | + |
| 93 | +## Quality Metrics |
| 94 | + |
| 95 | +-------------------------------------------------------------------------------- |
| 96 | + |
| 97 | +### Peak Signal-to-Noise Ratio (PSNR) |
| 98 | + |
| 99 | +According to |
| 100 | +[Wikipedia](https://en.wikipedia.org/wiki/Peak_signal-to-noise_ratio): |
| 101 | + |
| 102 | +> Peak signal-to-noise ratio, often abbreviated PSNR, is an engineering term for |
| 103 | +> the ratio between the maximum possible power of a signal and the power of |
| 104 | +> corrupting noise that affects the fidelity of its representation. Because many |
| 105 | +> signals have a very wide dynamic range, PSNR is usually expressed in terms of |
| 106 | +> the logarithmic decibel scale. |
| 107 | + |
| 108 | +PSNR is commonly used to measure image quality even though its correlation with |
| 109 | +human preferences is rather low (see the [TID 2013 |
| 110 | +study](http://www.ponomarenko.info/tid2013.htm)). You can calculate the PSNR |
| 111 | +between two images using |
| 112 | +[tf.image.psnr()](https://www.tensorflow.org/api_docs/python/tf/image/psnr). |
| 113 | + |
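For readers who want to check values without TensorFlow, the definition quoted above can be sketched in plain Python. This is a minimal illustration, not the code used to produce the CSV files: the function name `psnr`, the flat-sequence input, and the default peak value of 255 (8-bit images) are assumptions made for this example.

```python
import math


def psnr(reference, distorted, max_val=255.0):
    """Return the PSNR in dB between two equal-length sequences of pixel values.

    `max_val` is the largest possible pixel value (255 for 8-bit images).
    """
    if len(reference) != len(distorted):
        raise ValueError("inputs must contain the same number of samples")
    if len(reference) == 0:
        raise ValueError("inputs must not be empty")
    # Mean squared error over all samples.
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        # Identical images: the noise power is zero, so PSNR is unbounded.
        return math.inf
    return 10.0 * math.log10(max_val ** 2 / mse)
```

For example, an 8-bit image in which every pixel is off by exactly one level has an MSE of 1 and therefore a PSNR of `20 * log10(255)`, roughly 48.13 dB.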
| 114 | +### Multiscale Structural Similarity (MS-SSIM) |
| 115 | + |
| 116 | +Multiscale Structural Similarity (MS-SSIM) is an extension of [structural |
| 117 | +similarity (SSIM)](https://en.wikipedia.org/wiki/Structural_similarity) that |
| 118 | +adds flexibility by measuring similarity at different spatial scales. It was |
| 119 | +developed in 2003 by Wang, Simoncelli, and Bovik |
| 120 | +([PDF](https://www.cns.nyu.edu/pub/eero/wang03b.pdf)). MS-SSIM is generally |
| 121 | +thought to match human preferences better than PSNR, although optimizing directly |
| 122 | +for MS-SSIM can lead to objectionable distortion, e.g. blurrier reconstructions |
| 123 | +around text and faces. |
| 124 | + |
| 125 | +You can calculate the MS-SSIM score between two images using |
| 126 | +[tf.image.ssim_multiscale()]( |
| 127 | +https://www.tensorflow.org/api_docs/python/tf/image/ssim_multiscale). Note that |
| 128 | +both SSIM and MS-SSIM have a maximum score of 1.0, and very small differences in |
| 129 | +score can correspond to very large visual differences. For this reason, we often |
| 130 | +plot MS-SSIM in decibels to improve readability: |
| 131 | +`ms_ssim_db = -10 * log10(1 - ms_ssim)`. |
| 132 | + |
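The decibel conversion above can be written as a small helper. This is a sketch of that one formula only; the name `ms_ssim_to_db` and the handling of a perfect score of 1.0 (returned as infinity) are choices made for this example.

```python
import math


def ms_ssim_to_db(ms_ssim):
    """Convert an MS-SSIM score (at most 1.0) to decibels: -10 * log10(1 - ms_ssim)."""
    if ms_ssim > 1.0:
        raise ValueError("MS-SSIM cannot exceed 1.0")
    if ms_ssim == 1.0:
        # A perfect score leaves no residual, so the dB value is unbounded.
        return math.inf
    return -10.0 * math.log10(1.0 - ms_ssim)
```

On this scale a score of 0.9 maps to 10 dB and 0.99 maps to 20 dB, so each additional "nine" adds 10 dB. Differences that are hard to see on a linear axis become clearly separated.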
| 133 | +### Colorspaces |
| 134 | + |
| 135 | +Many research papers on learned image compression report image quality results |
| 136 | +(distortion) averaged over the RGB channels. While mathematically valid, this |
| 137 | +approach does not match the sensitivity of the human visual system (e.g. we're |
| 138 | +more sensitive to green than blue) and is **not** in line with common practice |
| 139 | +in the image processing community. |
| 140 | + |
| 141 | +We provide RGB evaluation results to facilitate comparing against older papers, |
| 142 | +but we **strongly recommend** that future papers report results either on the |
| 143 | +luminance channel alone (`Y'` in `Y'CbCr`) or as a 6:1:1 weighted average over |
| 144 | +the `Y'CbCr` channels. |
| 145 | + |
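The two recommended alternatives can be sketched as follows. Both helpers are illustrative assumptions rather than the evaluation code behind the CSV files: `rgb_to_luma` uses the ITU-R BT.601 luma coefficients, which is one common definition of `Y'`, and `weighted_ycbcr_score` assumes that the 6:1:1 weights are applied to per-channel scores (for example, per-channel PSNR values).

```python
def rgb_to_luma(r, g, b):
    """Return BT.601 luma (Y') for gamma-encoded R'G'B' values in a common range."""
    return 0.299 * r + 0.587 * g + 0.114 * b


def weighted_ycbcr_score(y_score, cb_score, cr_score, weights=(6.0, 1.0, 1.0)):
    """Combine per-channel quality scores with a 6:1:1 (Y':Cb:Cr) weighting."""
    w_y, w_cb, w_cr = weights
    total = w_y + w_cb + w_cr
    return (w_y * y_score + w_cb * cb_score + w_cr * cr_score) / total
```

With these weights, per-channel scores of 40, 32, and 32 combine to `(6 * 40 + 32 + 32) / 8 = 38`. The luma coefficients also reflect the sensitivity argument above: green contributes 0.587 to `Y'`, while blue contributes only 0.114.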
| 146 | +## Data Sets for Evaluation |
| 147 | + |
| 148 | +-------------------------------------------------------------------------------- |
| 149 | + |
| 150 | +### Kodak |
| 151 | + |
| 152 | +The Kodak data set is a collection of 24 images with resolution 768x512 (or |
| 153 | +512x768). The images are available as PNG files here: |
| 154 | +[http://r0k.us/graphics/kodak](http://r0k.us/graphics/kodak) |
| 155 | + |
| 156 | + @misc{kodak, |
| 157 | + title="Kodak Lossless True Color Image Suite ({PhotoCD PCD0992})", |
| 158 | + author="Eastman Kodak", |
| 159 | + url = {http://r0k.us/graphics/kodak}, |
| 160 | + } |
| 161 | + |
| 162 | +### Tecnick |
| 163 | + |
| 164 | +The Tecnick data set contains 100 images with resolution 1200x1200. It is |
| 165 | +available for download here (511 MB): |
| 166 | +[https://sourceforge.net/projects/testimages/files/OLD/OLD_SAMPLING/testimages.zip](https://sourceforge.net/projects/testimages/files/OLD/OLD_SAMPLING/testimages.zip) |
| 167 | + |
| 168 | + @inproceedings{tecnick, |
| 169 | + author = {N. Asuni and A. Giachetti}, |
| 170 | + title = {{TESTIMAGES}: A large-scale archive for testing visual devices and basic image processing algorithms {(SAMPLING 1200 RGB set)}}, |
| 171 | + year = {2014}, |
| 172 | + booktitle = {{STAG}: Smart Tools and Apps for Graphics}, |
| 173 | + url = {https://sourceforge.net/projects/testimages/files/OLD/OLD_SAMPLING/testimages.zip}, |
| 174 | + } |