Fast Discrete Wavelet Transform on CUDA

The 2D Discrete Wavelet Transform (DWT) is widely used in image processing applications. To provide a fast DWT, we have developed DWT kernels that run on NVIDIA CUDA hardware. We have implemented both lifting and convolution-based algorithms, and the forward and inverse transforms run at the speeds reported in the benchmarks below.

The simplest wavelet is the Haar wavelet, and we have implemented both 1D and 2D Haar transforms. We have also implemented the Cohen–Daubechies–Feauveau 5/3 and 9/7 (CDF 5/3 and CDF 9/7) wavelet transforms. These are biorthogonal wavelets used in JPEG 2000 image compression, wavelet denoising, image classification and many other applications.
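To illustrate what a lifting-based transform looks like, here is a minimal CPU sketch of one level of the 1D CDF 5/3 transform in its reversible integer form (the variant used for lossless coding in JPEG 2000). It is not the CUDA kernel described on this page: the function names `cdf53_forward` and `cdf53_inverse` are hypothetical, and the choice of integer arithmetic and of whole-sample symmetric extension at the borders is an assumption made for the example.

```python
def cdf53_forward(x):
    """One level of the reversible integer CDF 5/3 lifting transform.

    Returns (s, d): low-pass and high-pass coefficients.
    Borders use whole-sample symmetric extension (assumed for this sketch).
    """
    if len(x) < 2:
        raise ValueError("need at least two samples")
    even, odd = x[0::2], x[1::2]
    ne, no = len(even), len(odd)
    # Predict step: each odd sample minus the floor-average of its even
    # neighbours. min() mirrors the missing right neighbour at the border.
    d = [odd[i] - ((even[i] + even[min(i + 1, ne - 1)]) >> 1)
         for i in range(no)]
    # Update step: each even sample plus a rounded quarter of the sum of
    # the neighbouring details. max()/min() mirror at both borders.
    s = [even[i] + ((d[max(i - 1, 0)] + d[min(i, no - 1)] + 2) >> 2)
         for i in range(ne)]
    return s, d


def cdf53_inverse(s, d):
    """Undo cdf53_forward by running the lifting steps in reverse order."""
    ne, no = len(s), len(d)
    if ne - no not in (0, 1):
        raise ValueError("subband lengths do not match a single signal")
    # Undo the update step to recover the even samples.
    even = [s[i] - ((d[max(i - 1, 0)] + d[min(i, no - 1)] + 2) >> 2)
            for i in range(ne)]
    # Undo the predict step to recover the odd samples.
    odd = [d[i] + ((even[i] + even[min(i + 1, ne - 1)]) >> 1)
           for i in range(no)]
    # Interleave even and odd samples back into one signal.
    x = [0] * (ne + no)
    x[0::2] = even
    x[1::2] = odd
    return x
```

Because every lifting step is undone exactly, `cdf53_inverse(*cdf53_forward(x))` returns the original integer samples. A 2D transform is usually obtained by applying such a 1D step along the rows and then along the columns; the CDF 9/7 transform follows the same pattern with four floating-point lifting steps plus scaling.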

Hardware and software

  • CPU Intel Core i7-5930K (Haswell-E, 6 cores, 3.5–3.7 GHz)
  • GPU NVIDIA GeForce GTX 980 (Maxwell, 16 SMM, 2048 cores, 1.1–1.3 GHz)
  • OS Windows 7 SP1 (x64)
  • CUDA Toolkit 7.5

Benchmarks for 2D Discrete Wavelet Transform on CUDA

Image: 5180 × 5180 pixels, 24-bit; 1-, 2- and 3-level 2D DWT; symmetric boundary conditions
Test conditions: all data resides in GPU memory; timings include GPU computation only

  • Forward/Inverse CDF 9/7 – 6 ms (16 GByte/s) for one-level 2D DWT
  • Forward/Inverse CDF 9/7 – 9 ms (11 GByte/s) for two-level 2D DWT
  • Forward/Inverse CDF 9/7 – 12 ms (8 GByte/s) for three-level 2D DWT
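In a multi-level 2D DWT, each additional level transforms only the low-pass (LL) subband left by the previous level, so the data processed per level shrinks by roughly a factor of four. The sketch below computes the subband dimensions per level. The helper name `subband_sizes` is hypothetical, and giving the low-pass band the extra sample for odd sizes is an assumed convention (it matches the lifting split into even and odd samples), not something stated in the benchmark description.

```python
def subband_sizes(width, height, levels):
    """Return the subband dimensions (w, h) for each decomposition level.

    Assumption: for an odd dimension, the low-pass band gets the extra
    sample (ceil), the high-pass band gets floor.
    """
    sizes = []
    w, h = width, height
    for level in range(1, levels + 1):
        low_w, low_h = (w + 1) // 2, (h + 1) // 2
        high_w, high_h = w // 2, h // 2
        sizes.append({
            "level": level,
            "LL": (low_w, low_h),    # low-pass in both directions
            "HL": (high_w, low_h),   # high-pass horizontally
            "LH": (low_w, high_h),   # high-pass vertically
            "HH": (high_w, high_h),  # high-pass in both directions
        })
        # The next level decomposes only the LL subband.
        w, h = low_w, low_h
    return sizes
```

For the 5180 × 5180 test image, `subband_sizes(5180, 5180, 3)` gives LL subbands of 2590 × 2590, 1295 × 1295 and 648 × 648 at levels 1, 2 and 3.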

This software is part of our CUDA image processing SDK, so our customers can use the fast 2D DWT on CUDA in their own applications.
