Fast Gaussian blur in real time on CUDA

Gaussian filtering is a widely used standard algorithm that is a must in many applications, from sharpening/USM to SIFT/SURF. The Gaussian filter is isotropic and separable: a 2D blur can be computed as a horizontal 1D pass followed by a vertical 1D pass. These properties are very important for fast and efficient image processing. Gaussian filtering is usually a time-consuming task, which is why it is a good idea to accelerate it with CUDA.
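To illustrate why separability matters, here is a minimal CPU reference sketch in plain Python. It is not the CUDA implementation described on this page. The function names, the clamp-to-edge border handling and the explicit `radius` parameter are all assumptions made for the example. It shows that two 1D passes give the same result as a full 2D window while reading far fewer samples per pixel.

```python
import math

def gaussian_kernel_1d(sigma, radius):
    """Normalized 1D Gaussian weights for offsets -radius..radius."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur_rows(image, kernel):
    """Convolve every row with the 1D kernel (borders clamped: an assumption)."""
    radius = len(kernel) // 2
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            acc = 0.0
            for k, w in enumerate(kernel):
                xx = min(max(x + k - radius, 0), width - 1)
                acc += w * image[y][xx]
            out[y][x] = acc
    return out

def transpose(image):
    return [list(column) for column in zip(*image)]

def separable_blur(image, sigma, radius):
    """Horizontal pass, then vertical pass (done as a row pass on the transpose)."""
    kernel = gaussian_kernel_1d(sigma, radius)
    horizontal = blur_rows(image, kernel)
    return transpose(blur_rows(transpose(horizontal), kernel))

def full_2d_blur(image, sigma, radius):
    """Direct 2D convolution with the outer product of the 1D kernel."""
    kernel = gaussian_kernel_1d(sigma, radius)
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            acc = 0.0
            for j, wy in enumerate(kernel):
                yy = min(max(y + j - radius, 0), height - 1)
                for i, wx in enumerate(kernel):
                    xx = min(max(x + i - radius, 0), width - 1)
                    acc += wy * wx * image[yy][xx]
            out[y][x] = acc
    return out

if __name__ == "__main__":
    radius = 2  # 5x5 window, as in the sigma ~ 1 benchmark
    taps_2d = (2 * radius + 1) ** 2
    taps_separable = 2 * (2 * radius + 1)
    print(taps_2d, taps_separable)  # prints "25 10"
```

For a 5×5 window the separable form reads 10 samples per pixel instead of 25, and the gap widens as the window grows.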

Standard features for Gaussian filtering on CUDA

  • Data input: 8/16 or 24/48-bit images in CPU or GPU memory
  • Data output: final image in CPU or GPU memory
  • Parameters: sigma (blur radius)
  • Optimized for the latest NVIDIA GPUs
  • Compatible with Windows 7/8/10 and Linux (32/64-bit)

Benchmarks for Gaussian blur on GeForce GTX 980 (Windows-7 and CUDA-7.5, 32-bit)

Gaussian blur (sigma ~ 1, 5×5 window) of a 24-bit color image at 3840×2160 resolution now takes just ~8 ms. These are benchmarks for 2K / 4K / 20 Mpix images, 24-bit (computations on the GPU only, without device I/O latency):

  • Full HD (2K, 1920×1080) ~ 2.4 GByte/s
  • 4K (3840×2160) ~ 3 GByte/s
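
The throughput figures above can be related to per-frame times with a short calculation. The sketch below assumes that 1 GByte means 10^9 bytes and that throughput is counted against the size of the 24-bit input image (3 bytes per pixel). The page does not state either convention, and the helper names are hypothetical.

```python
def frame_bytes(width, height, bytes_per_pixel):
    """Size of one uncompressed frame in bytes."""
    return width * height * bytes_per_pixel

def blur_time_ms(width, height, bytes_per_pixel, throughput_gbyte_s):
    """Per-frame time in milliseconds, assuming 1 GByte = 1e9 bytes."""
    size = frame_bytes(width, height, bytes_per_pixel)
    return size / (throughput_gbyte_s * 1e9) * 1e3

if __name__ == "__main__":
    benchmarks = [("Full HD", 1920, 1080, 2.4), ("4K", 3840, 2160, 3.0)]
    for name, width, height, rate in benchmarks:
        print(f"{name}: {blur_time_ms(width, height, 3, rate):.1f} ms")
    # prints:
    # Full HD: 2.6 ms
    # 4K: 8.3 ms
```

Under these assumptions the 4K result agrees with the ~8 ms figure quoted above.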