

GPU Acceleration of JPEG2000


The aim of the project is a GPU-accelerated implementation of JPEG2000 for real-time high-definition video compression and fast compression of large-scale images.

JPEG2000 is an image compression standard created by the Joint Photographic Experts Group (JPEG). It aims to provide not only compression performance superior to the original JPEG standard but also advanced capabilities demanded by applications in fields such as medical imaging, the film industry, and image archiving. It offers optional mathematically lossless processing, error resilience, and progressive image transmission, in which pixel accuracy and resolution improve as more data arrives. On the other hand, the advanced features and the superior compression performance come with higher computational demands, which means slower processing.

Attracted by the raw computing power of GPUs, developers have introduced a number of general-purpose GPU computing approaches in recent years, including GLSL, CUDA, and OpenCL. Because of its flexibility and its potential to exploit the power of GPUs, we have opted for CUDA, a massively parallel computing architecture designed by NVIDIA. Modern GPU architectures are designed to run thousands of threads in parallel.


Fig. 1: Block diagram of JPEG2000 encoder


Before it reaches EBCOT, the input image is transformed using the Discrete Wavelet Transform (DWT) and optionally quantized. The image data is then partitioned into so-called code-blocks. Each code-block is processed independently by the context-modelling and arithmetic MQ-coder modules in Tier-1. The context modeller analyzes the bit structure of a code-block and collects contextual information CX, which is passed together with the bit values D to the arithmetic coding module for binary compression.
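The partitioning step and the "bit structure" of a code-block can be illustrated with a small CPU sketch. It is not the project's CUDA code: the function names and the default block size of 64 are assumptions made for illustration, and the derivation of the contexts CX from neighbouring samples (done in several coding passes per bit-plane) is left out entirely.

```python
def partition_into_code_blocks(subband, block_size=64):
    """Split a 2-D list of coefficients into independent code-blocks.

    block_size=64 is an assumed value; blocks on the right and bottom
    edges may be smaller than block_size.
    """
    height = len(subband)
    width = len(subband[0]) if height else 0
    blocks = []
    for top in range(0, height, block_size):
        for left in range(0, width, block_size):
            block = [row[left:left + block_size]
                     for row in subband[top:top + block_size]]
            blocks.append(((top, left), block))
    return blocks


def bit_planes(block):
    """Return sign flags and magnitude bit-planes (most significant first)."""
    magnitudes = [[abs(v) for v in row] for row in block]
    signs = [[1 if v < 0 else 0 for v in row] for row in block]
    top_bit = max((m for row in magnitudes for m in row), default=0).bit_length()
    planes = [
        [[(m >> p) & 1 for m in row] for row in magnitudes]
        for p in range(top_bit - 1, -1, -1)
    ]
    return signs, planes


# Toy 4x4 "subband" split into 2x2 code-blocks.
subband = [[3, -5, 0, 1],
           [2, 7, -1, 0],
           [0, 0, 4, -6],
           [1, 1, 1, 1]]
blocks = partition_into_code_blocks(subband, block_size=2)
signs, planes = bit_planes(blocks[0][1])
```

Because each code-block is self-contained, the blocks can be handed to separate workers, which is what makes this stage attractive for a massively parallel device.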

Discrete Wavelet Transform (DWT)

DWT processing time (ms) for a single-color 8-bit image in Full HD resolution (1920×1080).

DWT processing time (ms).
GPU              Transformation     Time (ms), 1 level / 3 levels
NVIDIA GTX 580   Forward DWT 9/7    0.2 / 0.35
NVIDIA GTX 580   Reverse DWT 9/7    0.28 / 0.45
NVIDIA GTX 580   Forward DWT 5/3    0.17 / 0.28
NVIDIA GTX 580   Reverse DWT 5/3    0.19 / 0.32
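For readers unfamiliar with the 5/3 transform measured above, the following is a minimal one-dimensional CPU sketch of the reversible 5/3 lifting scheme with symmetric boundary extension. It is not the GPU implementation behind the timings; the function names are hypothetical, the input is assumed to be an even-length list of integers, and a 2-D transform would apply the same steps to rows and then to columns.

```python
def _mirror(i, n):
    """Whole-sample symmetric extension index for a signal of length n."""
    if i < 0:
        return -i
    if i >= n:
        return 2 * (n - 1) - i
    return i


def forward_53(x):
    """One level of reversible 5/3 lifting; returns (low, high) sub-bands."""
    n = len(x)
    half = n // 2
    # Predict step: each odd sample minus the floor-average of its even neighbours.
    high = [x[2 * i + 1] - ((x[2 * i] + x[_mirror(2 * i + 2, n)]) >> 1)
            for i in range(half)]
    # Update step: each even sample plus a rounded quarter of adjacent details.
    low = [x[2 * i] + ((high[max(i - 1, 0)] + high[i] + 2) >> 2)
           for i in range(half)]
    return low, high


def inverse_53(low, high):
    """Undo forward_53 exactly by running the lifting steps in reverse order."""
    half = len(low)
    n = 2 * half
    x = [0] * n
    for i in range(half):  # undo the update step: recover even samples
        x[2 * i] = low[i] - ((high[max(i - 1, 0)] + high[i] + 2) >> 2)
    for i in range(half):  # undo the predict step: recover odd samples
        x[2 * i + 1] = high[i] + ((x[2 * i] + x[_mirror(2 * i + 2, n)]) >> 1)
    return x


signal = [10, 12, 14, 13, 9, 4, 3, 8]
low, high = forward_53(signal)
restored = inverse_53(low, high)  # equals the original signal
```

Only integer additions and shifts are used, so the inverse reproduces the input bit for bit; this is what makes the 5/3 variant usable for lossless coding.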

Context-Modelling

To be published.

Arithmetic Coding (MQ-Coder)

Performance of CPU and GPU implementations of the MQ-Coder. Times are given in ms. Results were measured on an NVIDIA GTX 580 GPU for 8-bit RGB images in 720p (1280×720), 1080p (1920×1080), and 4K (4096×2160) resolutions.

MQ-Coder processing time (ms).
Implementation     720p   1080p   4K
OpenJPEG 1.4       157    316     1081
Jasper 1.900.1     89     178     594
Kakadu 6.4 (1t)    132    268     921
Kakadu 6.4 (2t)    67     137     476
Kakadu 6.4 (4t)    41     84      284
CUJ2K 1.1          25     49      166
CUDA GPU           7.3    8.1     17.6
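The relative speed-up implied by these timings can be worked out directly. The sketch below only divides the numbers from the table; the dictionary layout and the function name are illustrative choices, not part of any measurement tool used in the project.

```python
# MQ-Coder times in ms, copied from the table above: (720p, 1080p, 4K).
RESOLUTIONS = ("720p", "1080p", "4K")
MQ_CODER_MS = {
    "OpenJPEG 1.4": (157, 316, 1081),
    "Jasper 1.900.1": (89, 178, 594),
    "Kakadu 6.4 (1t)": (132, 268, 921),
    "Kakadu 6.4 (2t)": (67, 137, 476),
    "Kakadu 6.4 (4t)": (41, 84, 284),
    "CUJ2K 1.1": (25, 49, 166),
    "CUDA GPU": (7.3, 8.1, 17.6),
}


def speedups(times_ms, reference="CUDA GPU"):
    """Return {implementation: {resolution: time / reference time}}."""
    ref = times_ms[reference]
    return {
        name: {res: t / r for res, t, r in zip(RESOLUTIONS, row, ref)}
        for name, row in times_ms.items()
        if name != reference
    }


for name, row in speedups(MQ_CODER_MS).items():
    print(name, {res: round(s, 1) for res, s in row.items()})
```

For every implementation listed, the ratio grows with resolution (for OpenJPEG 1.4, roughly 22× at 720p, 39× at 1080p, and 61× at 4K), since the CUDA time rises far more slowly than the pixel count.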

EBCOT Tier-1

Performance of CPU and GPU implementations of the whole EBCOT Tier-1 (context modeller + MQ-Coder). Times are given in ms. Results were measured on an NVIDIA GTX 580 GPU for an 8-bit RGB image in 1080p (1920×1080) resolution.

Implementation     1080p
OpenJPEG 1.4       958
Jasper 1.900.1     777
Kakadu 6.4 (1t)
Kakadu 6.4 (2t)
Kakadu 6.4 (4t)    182
CUJ2K 1.1          143
CUDA GPU           20
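Since the project targets real-time video, it can help to read these per-frame times as frame rates. The sketch below converts the reported values only; the names are illustrative, and the result is an upper bound for Tier-1 alone, because a full encoder also spends time on the DWT, Tier-2, and data transfers.

```python
# Tier-1 times in ms per 1080p frame, copied from the table above.
# The two Kakadu rows without a reported value are omitted.
TIER1_MS = {
    "OpenJPEG 1.4": 958,
    "Jasper 1.900.1": 777,
    "Kakadu 6.4 (4t)": 182,
    "CUJ2K 1.1": 143,
    "CUDA GPU": 20,
}


def frames_per_second(ms_per_frame):
    """Convert a per-frame processing time in milliseconds to frames per second."""
    return 1000.0 / ms_per_frame


tier1_fps = {name: frames_per_second(ms) for name, ms in TIER1_MS.items()}

for name, fps in tier1_fps.items():
    print(f"{name}: {fps:.1f} frames/s (Tier-1 only)")
```

At 20 ms per frame, Tier-1 by itself would sustain at most 50 frames per second at 1080p, whereas 143 ms corresponds to about 7 frames per second.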

Publications

  • MATELA, Jiří – RUSŇÁK, Vít – HOLUB, Petr. Efficient JPEG2000 EBCOT Context Modeling for Massively Parallel Architectures. In Data Compression Conference (DCC), 2011. Snowbird, UT, USA : IEEE Computer Society, 2011. p. 423-432, 10 pp. ISBN 978-0-7695-4352-9.
    URL: http://dx.doi.org/10.1109/DCC.2011.49
    Bibtex:
    @InProceedings{916772,
    author = {Matela, Jiří and Rusňák, Vít and Holub, Petr},
    title = {{Efficient JPEG2000 EBCOT Context Modeling for Massively Parallel Architectures}},
    booktitle = {Data Compression Conference (DCC), 2011},
    year = {2011},
    month = mar,
    pages = {423--432},
    address = {Snowbird, USA},
    doi={10.1109/DCC.2011.49},
    ISSN={1068-0314}
    }

  • MATELA, Jiří – RUSŇÁK, Vít – HOLUB, Petr. GPU-Based Sample-Parallel Context Modeling for EBCOT in JPEG2000. In Sixth Doctoral Workshop on Mathematical and Engineering Methods in Computer Science (MEMICS’10) – Selected Papers. Dagstuhl, Germany : Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik, 2011. p. 77-84, 8 pp. ISBN 978-3-939897-22-4.
    URL: http://dx.doi.org/10.4230/OASIcs.MEMICS.2010.77
    Bibtex:
    @InProceedings{matela_et_al:OASIcs:2011:3068,
    author ={Jiri Matela and Vit Rusnak and Petr Holub},
    title ={{GPU-Based Sample-Parallel Context Modeling for EBCOT in JPEG2000}},
    booktitle ={Sixth Doctoral Workshop on Mathematical and Engineering Methods in Computer Science (MEMICS’10) — Selected Papers},
    pages ={77--84},
    series ={OpenAccess Series in Informatics (OASIcs)},
    ISBN ={978-3-939897-22-4},
    ISSN ={2190-6807},
    year ={2011},
    volume ={16},
    publisher ={Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik},
    address ={Dagstuhl, Germany},
    doi ={10.4230/OASIcs.MEMICS.2010.77}
    }

  • MATELA, Jiří. GPU-Based DWT Acceleration for JPEG2000. In Annual Doctoral Workshop on Mathematical and Engineering Methods in Computer Science. Brno : NOVPRESS s.r.o., 2009. p. 136-143, 8 pp. ISBN 978-80-87342-04-6.
    URL: http://www.sitola.cz/papers/858064.pdf
    Bibtex:
    @InProceedings{858064,
    author = {Matela, Jiří},
    title = {{GPU-Based DWT Acceleration for JPEG2000}},
    booktitle = {MEMICS 2009 Proceedings},
    year = {2009},
    pages = {136--143},
    address = {Brno},
    isbn = {978-80-87342-04-6}
    }
(c) 2008 SITOLA, Laboratory of Advanced Networking Technologies