GPU Video Compression with H.264/AVC codec

The H.264 encoder offers high-speed compression of video streams on NVIDIA GPUs. NVIDIA GPUs based on the Kepler, Maxwell, and later architectures contain a hardware-based H.264 video encoder. Because this encoder is dedicated H.264 hardware on the GPU chip, it does not use the GPU’s graphics engine and can work alongside CUDA applications. The hardware is optimized to provide excellent quality at high performance, enabling a wide range of solutions that require video encoding capabilities. Every GPU from NVIDIA’s Kepler family contains one encoder engine (one session), which is independent of the graphics engine. On NVIDIA Quadro GPUs (starting from the K5000), up to 8 codec sessions can run at once. Currently, the maximum resolution for one session is Full HD (1920×1080) on Kepler and 4K on Maxwell and Pascal.


  • Color RGB24 streams with 4:2:0 and 4:4:4 subsampling
  • Input video formats for encoding: RGB24
  • Baseline, Main and High Profile support
  • I, P and B frames support
  • 4×4 intra partitioning
  • Sub-pel motion estimation
  • High performance, low power
  • Integration with Fastvideo Image & Video Processing SDK for NVIDIA GPUs
  • System requirements: Windows 7/8/10 (64-bit), Linux (Ubuntu/CentOS), CUDA 10.2
  • Hardware requirements: NVIDIA Jetson, GeForce, Quadro or Tesla GPUs (Kepler, Maxwell, Pascal, Volta, Turing)
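
To illustrate what the 4:2:0 and 4:4:4 subsampling options in the list above mean for data volume, here is a minimal Python sketch that counts bytes per frame. It assumes 8-bit planar Y, Cb, Cr storage; the function name `frame_bytes` is hypothetical and not part of any SDK, and the encoder's internal layout may differ.

```python
def frame_bytes(width, height, subsampling):
    """Bytes per frame for 8-bit planar YCbCr (an assumed layout, for illustration)."""
    luma = width * height
    if subsampling == "4:4:4":
        # Chroma planes have the same resolution as the luma plane.
        chroma = width * height
    elif subsampling == "4:2:0":
        # Chroma planes are halved in both directions (rounded up for odd sizes).
        chroma = ((width + 1) // 2) * ((height + 1) // 2)
    else:
        raise ValueError(f"unsupported subsampling: {subsampling}")
    return luma + 2 * chroma


if __name__ == "__main__":
    for mode in ("4:4:4", "4:2:0"):
        print(mode, frame_bytes(1920, 1080, mode), "bytes per Full HD frame")
```

Under these assumptions, a Full HD frame takes 6,220,800 bytes in 4:4:4 and 3,110,400 bytes in 4:2:0, so 4:2:0 halves the amount of data handed to the encoder.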

Benchmarks for CUDA H.264 video encoder (High Performance Preset)

The H.264 video encoder on an NVIDIA GeForce GTX 980 GPU can compress 1920×1080 (24-bit) video sequences at up to 160 fps. At 1024×768 resolution, it reaches up to 400 fps with the High Performance Preset. For comparison, the CPU-based x264 codec compresses the same Full HD image sequence at up to 34 fps on an Intel Core i5-3330 CPU (3.00 GHz base, 3.20 GHz turbo) with the "ultrafast" preset.
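
To put these frame rates in terms of input bandwidth, the following Python sketch converts them to uncompressed data rates and computes the GPU-to-CPU ratio. It only does arithmetic on the figures quoted above; the helper names are hypothetical, and 1 MB is taken as 10^6 bytes.

```python
def raw_rate_mb_per_s(width, height, fps, bytes_per_pixel=3):
    """Uncompressed input rate in MB/s (1 MB = 10**6 bytes) for 24-bit frames."""
    return width * height * bytes_per_pixel * fps / 1e6


def speedup(gpu_fps, cpu_fps):
    """Ratio of the GPU frame rate to the CPU frame rate at the same resolution."""
    return gpu_fps / cpu_fps


if __name__ == "__main__":
    print(f"GTX 980, 1920x1080 @ 160 fps: {raw_rate_mb_per_s(1920, 1080, 160):.1f} MB/s")
    print(f"GTX 980, 1024x768 @ 400 fps:  {raw_rate_mb_per_s(1024, 768, 400):.1f} MB/s")
    print(f"x264,    1920x1080 @ 34 fps:  {raw_rate_mb_per_s(1920, 1080, 34):.1f} MB/s")
    print(f"GPU vs CPU at Full HD:        {speedup(160, 34):.1f}x")
```

The quoted rates correspond to roughly 995 MB/s of raw RGB input at Full HD and 944 MB/s at 1024×768 on the GPU, against about 212 MB/s for x264, a ratio of about 4.7.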

Options for H.264 video encoder

The H.264 encoder can be combined with an image processing pipeline on an NVIDIA GPU (CUDA): we start with raw data from a camera and finish with a compressed stream. Because the encoder is dedicated hardware that runs independently of CUDA code, video processing and encoding can overlap in time: while one frame is being encoded, the next one is already being processed.
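
The overlap described above can be sketched as a two-stage pipeline with a bounded hand-off buffer. This is a structural illustration only, in plain Python threads: `process_frame` and `encode_frame` are hypothetical stand-ins for the CUDA processing stage and the hardware encoder, not real SDK calls, and Python threads do not reproduce the actual GPU timing.

```python
import queue
import threading


def process_frame(raw):
    """Stand-in for the CUDA processing stage (hypothetical)."""
    return f"processed({raw})"


def encode_frame(frame):
    """Stand-in for the hardware H.264 encoder (hypothetical)."""
    return f"encoded({frame})"


def run_pipeline(raw_frames, depth=2):
    """Process and encode frames in two overlapping stages."""
    handoff = queue.Queue(maxsize=depth)  # bounded buffer between the stages
    encoded = []

    def encoder_worker():
        while True:
            frame = handoff.get()
            if frame is None:  # sentinel: no more frames
                break
            encoded.append(encode_frame(frame))

    worker = threading.Thread(target=encoder_worker)
    worker.start()
    for raw in raw_frames:
        # While the worker encodes frame N, this loop already processes frame N+1.
        handoff.put(process_frame(raw))
    handoff.put(None)
    worker.join()
    return encoded


if __name__ == "__main__":
    for packet in run_pipeline(["frame0", "frame1", "frame2"]):
        print(packet)
```

The bounded queue keeps the processing stage from running far ahead of the encoder, and a single consumer preserves frame order. In steady state the cost per frame of such a pipeline approaches the slower of the two stages rather than their sum.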

We can offer custom software design for the following image/video processing pipelines on CUDA:

  • White Balance + Debayer + LUT + H.264
  • Image preprocessing + White Balance + Debayer + Denoiser + Color Correction + LUT + Gamma + Crop + Resize + Sharpen + OpenGL output + H.264
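
As a rough illustration of the first pipeline in the list above, here is a CPU-side Python sketch of White Balance + Debayer + LUT on a tiny raw frame. It assumes an 8-bit RGGB Bayer pattern and uses a deliberately simple half-resolution demosaic; the function names are hypothetical, and neither the algorithms nor the API of the CUDA implementation are shown here. The resulting RGB frame is what would be passed on to the encoder.

```python
def white_balance(raw, gains):
    """Scale each Bayer sample by its channel gain, clamping to 8 bits."""
    r_gain, g_gain, b_gain = gains
    pattern = ((r_gain, g_gain), (g_gain, b_gain))  # RGGB layout (assumed)
    return [
        [min(255, round(v * pattern[y % 2][x % 2])) for x, v in enumerate(row)]
        for y, row in enumerate(raw)
    ]


def debayer_2x2(raw):
    """Half-resolution demosaic: each 2x2 RGGB cell becomes one RGB pixel."""
    rgb = []
    for y in range(0, len(raw) - 1, 2):
        row = []
        for x in range(0, len(raw[y]) - 1, 2):
            r = raw[y][x]
            g = (raw[y][x + 1] + raw[y + 1][x] + 1) // 2  # rounded mean of two greens
            b = raw[y + 1][x + 1]
            row.append((r, g, b))
        rgb.append(row)
    return rgb


def apply_lut(rgb, lut):
    """Map every channel value through a 256-entry lookup table."""
    return [[tuple(lut[c] for c in px) for px in row] for row in rgb]


def pipeline(raw, gains, lut):
    """White Balance + Debayer + LUT, in the order given in the list above."""
    return apply_lut(debayer_2x2(white_balance(raw, gains)), lut)


if __name__ == "__main__":
    raw = [[100, 50, 90, 40],
           [60, 40, 70, 30]]
    gamma_lut = [round(255 * (i / 255) ** (1 / 2.2)) for i in range(256)]
    print(pipeline(raw, gains=(2.0, 1.0, 1.5), lut=gamma_lut))
```

Each stage takes a frame and returns a frame, so further steps from the second pipeline (denoiser, crop, resize, and so on) would slot in as additional functions in the same chain.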

We can also extend the H.264 encoder with a fast image processing pipeline for camera applications: image acquisition, dark frame subtraction, shading correction, color correction, demosaic, image filters, denoiser, crop, resize, rotation, sharpen, OpenGL output, etc.

Roadmap for further improvements

  • New version of H.264 codec for parallel execution with CUDA applications - done
  • H.264 codec for Jetson Nano, TX1, TX2, AGX Xavier - done
  • H.265 codec integration - done

For any further information concerning the GPU H.264 encoder, please contact us via email or the contact form.
