GPU Acceleration of JPEG2000
JPEG2000 is an image compression standard created by the Joint Photographic Experts Group (JPEG). It aims to provide not only compression performance superior to the current JPEG standard but also advanced capabilities demanded by applications in fields such as medical imaging, the film industry, and image archiving. Its features include optional mathematically lossless processing, error resilience, and progressive image transmission by improving pixel accuracy and resolution. On the other hand, these advanced features and the superior compression performance come with higher computational demands, which imply slower processing.
Attracted by the raw computing power of GPUs, developers have introduced a number of general-purpose GPU computing approaches in recent years, including GLSL, CUDA, and OpenCL. Because of its flexibility and its potential to utilize the power of GPUs, we have opted for CUDA, a massively parallel computing architecture designed by NVIDIA. Modern GPU architectures are designed to run thousands of threads in parallel.

Fig. 1: Block diagram of JPEG2000 encoder
The input to EBCOT (Embedded Block Coding with Optimized Truncation) is first transformed using the Discrete Wavelet Transform (DWT) and optionally quantized. The image data is then partitioned into so-called code-blocks. Each code-block is processed independently by the context-modeling and arithmetic MQ-coder modules in Tier-1. The context modeller analyzes the bit structure of a code-block and collects contextual information (CX), which is passed together with the bit values (D) to the arithmetic coding module for binary compression.
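The partitioning step described above can be illustrated with a minimal Python sketch. It is not the CUDA implementation discussed on this page. The function names, the 64×64 block size, and the 960×540 subband size are assumptions chosen for illustration only. The sketch splits one subband into code-blocks and decomposes coefficient magnitudes into bit-planes, which is the bit structure the context modeller works on.

```python
def partition_into_code_blocks(height, width, block_size=64):
    """Return (top, left, block_height, block_width) for every code-block.

    Blocks on the bottom and right edges are clipped to the subband size.
    The 64x64 default is an assumed, commonly used code-block size.
    """
    blocks = []
    for top in range(0, height, block_size):
        for left in range(0, width, block_size):
            blocks.append((top, left,
                           min(block_size, height - top),
                           min(block_size, width - left)))
    return blocks


def bit_planes(coefficients, num_planes):
    """Split coefficient magnitudes into bit-planes, most significant first."""
    return [[(abs(c) >> p) & 1 for c in coefficients]
            for p in range(num_planes - 1, -1, -1)]


# Hypothetical subband of 960x540 coefficients.
blocks = partition_into_code_blocks(540, 960)
planes = bit_planes([5, -3, 0], num_planes=3)
```

Because no code-block depends on another, each entry in `blocks` could be handed to a separate worker, which is what makes Tier-1 a candidate for parallel processing.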
Discrete Wavelet Transform (DWT)
DWT processing time (ms) for a single-color 8-bit image in Full HD resolution (1920×1080).
| GPU | Transformation | Time (ms) (1 level / 3 levels) |
| NVIDIA GTX 580 | Forward DWT 9/7 | 0.2 / 0.35 |
| NVIDIA GTX 580 | Reverse DWT 9/7 | 0.28 / 0.45 |
| NVIDIA GTX 580 | Forward DWT 5/3 | 0.17 / 0.28 |
| NVIDIA GTX 580 | Reverse DWT 5/3 | 0.19 / 0.32 |
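For readers unfamiliar with the 5/3 transform listed in the table, the following is a minimal CPU-side sketch of one level of the reversible 5/3 lifting scheme on a 1D signal. It is not the GPU kernel measured above. The function names are hypothetical, the signal length is assumed to be even, and boundaries are handled by mirroring the nearest sample.

```python
def forward_53(x):
    """One level of the reversible 5/3 lifting DWT on an even-length list of ints."""
    n = len(x)
    assert n >= 2 and n % 2 == 0, "sketch assumes an even-length signal"
    half = n // 2
    even, odd = x[0::2], x[1::2]
    # Predict step: high-pass = odd sample minus the floor-average of its
    # two even neighbours (right neighbour mirrored at the end).
    high = [odd[i] - ((even[i] + even[min(i + 1, half - 1)]) >> 1)
            for i in range(half)]
    # Update step: low-pass = even sample plus a rounded quarter of the
    # two neighbouring high-pass values (left neighbour mirrored at the start).
    low = [even[i] + ((high[max(i - 1, 0)] + high[i] + 2) >> 2)
           for i in range(half)]
    return low, high


def inverse_53(low, high):
    """Undo forward_53 exactly by reversing the lifting steps."""
    half = len(low)
    even = [low[i] - ((high[max(i - 1, 0)] + high[i] + 2) >> 2)
            for i in range(half)]
    odd = [high[i] + ((even[i] + even[min(i + 1, half - 1)]) >> 1)
           for i in range(half)]
    x = [0] * (2 * half)
    x[0::2] = even
    x[1::2] = odd
    return x


signal = [10, 12, 14, 200, 15, 13, 11, 9]
low, high = forward_53(signal)
restored = inverse_53(low, high)
```

The lifting steps use only integer additions and shifts, so the inverse reproduces the input exactly. A 2D transform applies the same steps along rows and then columns, and further levels repeat the process on the low-pass output.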
Context-Modelling
To be published.
Arithmetic Coding (MQ-Coder)
Performance of CPU and GPU implementations of the MQ-Coder. Times are given in ms. Results were measured on an NVIDIA GTX 580 GPU for 8-bit RGB images in 720p (1280×720), 1080p (1920×1080), and 4K (4096×2160) resolutions.
| Implementation | 720p | 1080p | 4K |
| OpenJPEG 1.4 | 157 | 316 | 1081 |
| Jasper 1.900.1 | 89 | 178 | 594 |
| Kakadu 6.4 (1t) | 132 | 268 | 921 |
| Kakadu 6.4 (2t) | 67 | 137 | 476 |
| Kakadu 6.4 (4t) | 41 | 84 | 284 |
| CUJ2K 1.1 | 25 | 49 | 166 |
| CUDA GPU | 7.3 | 8.1 | 17.6 |
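To put the table in perspective, the short sketch below computes the speedup of the CUDA GPU row over each other implementation at each resolution. The timings are copied from the table above. The variable names are illustrative, and the ratios are rounded to one decimal place.

```python
# MQ-Coder timings in ms from the table above: (720p, 1080p, 4K).
timings_ms = {
    "OpenJPEG 1.4": (157, 316, 1081),
    "Jasper 1.900.1": (89, 178, 594),
    "Kakadu 6.4 (1t)": (132, 268, 921),
    "Kakadu 6.4 (2t)": (67, 137, 476),
    "Kakadu 6.4 (4t)": (41, 84, 284),
    "CUJ2K 1.1": (25, 49, 166),
}
cuda_gpu_ms = (7.3, 8.1, 17.6)

# Speedup = other implementation's time divided by the CUDA GPU time.
speedup = {
    name: tuple(round(t / g, 1) for t, g in zip(times, cuda_gpu_ms))
    for name, times in timings_ms.items()
}

for name, ratios in speedup.items():
    print(f"{name:16s} " + "  ".join(f"{r:5.1f}x" for r in ratios))
```

For every implementation in the table, the computed ratio grows with image resolution. Against Kakadu 6.4 with four threads, for example, it rises from about 5.6× at 720p to about 16.1× at 4K.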
EBCOT Tier-1
Performance of CPU and GPU implementations of the whole EBCOT Tier-1 (context modeller + MQ-Coder). Times are given in ms. Results were measured on an NVIDIA GTX 580 GPU for an 8-bit RGB image in 1080p (1920×1080) resolution.
| Implementation | 1080p |
| OpenJPEG 1.4 | 958 |
| Jasper 1.900.1 | 777 |
| Kakadu 6.4 (1t) | |
| Kakadu 6.4 (2t) | |
| Kakadu 6.4 (4t) | 182 |
| CUJ2K 1.1 | 143 |
| CUDA GPU | 20 |
Publications
- MATELA, Jiří – RUSŇÁK, Vít – HOLUB, Petr. Efficient JPEG2000 EBCOT Context Modeling for Massively Parallel Architectures. In Data Compression Conference (DCC), 2011. Snowbird, UT, USA : IEEE Computer Society, 2011. p. 423-432, 10 pp. ISBN 978-0-7695-4352-9.
URL: http://dx.doi.org/10.1109/DCC.2011.49
Bibtex:
@InProceedings{916772,
author = {Matela, Jiří and Rusňák, Vít and Holub, Petr},
title = {{Efficient JPEG2000 EBCOT Context Modeling for Massively Parallel Architectures}},
booktitle = {Data Compression Conference (DCC), 2011},
year = {2011},
month = mar,
pages = {423--432},
address = {Snowbird, USA},
doi={10.1109/DCC.2011.49},
ISSN={1068-0314}
}
- MATELA, Jiří – RUSŇÁK, Vít – HOLUB, Petr. GPU-Based Sample-Parallel Context Modeling for EBCOT in JPEG2000. In Sixth Doctoral Workshop on Mathematical and Engineering Methods in Computer Science (MEMICS’10) – Selected Papers. Dagstuhl, Germany : Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2011. p. 77-84, 8 pp. ISBN 978-3-939897-22-4.
URL: http://dx.doi.org/10.4230/OASIcs.MEMICS.2010.77
Bibtex:
@InProceedings{matela_et_al:OASIcs:2011:3068,
author ={Jiri Matela and Vit Rusnak and Petr Holub},
title ={{GPU-Based Sample-Parallel Context Modeling for EBCOT in JPEG2000}},
booktitle ={Sixth Doctoral Workshop on Mathematical and Engineering Methods in Computer Science (MEMICS’10) — Selected Papers},
pages ={77–84},
series ={OpenAccess Series in Informatics (OASIcs)},
ISBN ={978-3-939897-22-4},
ISSN ={2190-6807},
year ={2011},
volume ={16},
publisher ={Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik},
address ={Dagstuhl, Germany},
doi ={10.4230/OASIcs.MEMICS.2010.77}
}
- MATELA, Jiří. GPU-Based DWT Acceleration for JPEG2000. In Annual Doctoral Workshop on Mathematical and Engineering Methods in Computer Science. Brno : NOVPRESS s.r.o., 2009. p. 136-143, 8 pp. ISBN 978-80-87342-04-6.
URL: http://www.sitola.cz/papers/858064.pdf
Bibtex:
@InProceedings{858064,
author = {Matela, Jiří},
title = {{GPU-Based DWT Acceleration for JPEG2000}},
booktitle = {MEMICS 2009 Proceedings},
year = {2009},
pages = {136-143},
address = {Brno},
isbn = {978-80-87342-04-6}
}




