Applications for fast Image & Video Processing on GPU
We can offer custom solutions for high performance imaging applications in various fields:
To get a better understanding of image processing performance in such solutions, please have a look at the SDK benchmarks.
Long-term video recording for cameras with high frame rate
Many modern high-speed, high-resolution cameras have data rates in the range of 200–4500 MByte/s or even more, and it is quite complicated to capture such a stream and save it to HDD/SSD. To solve that problem, one can do real-time JPEG compression on GPU, which extends the duration of video recording by 10–20 times. One more benefit of that solution: one can use a conventional SSD instead of a RAID. On an NVIDIA GeForce RTX 4090 GPU, one can do JPEG compression of a 24-bit 65 MPix image within 0.65 ms.
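The gain in recording time can be estimated with simple arithmetic. The sketch below is illustrative only: the 2000 MByte/s camera rate, the 2 TB SSD and the 10:1 compression ratio are assumed values, and `recording_seconds` is a hypothetical helper, not part of any SDK.

```python
MB = 1_000_000
TB = 1_000_000_000_000


def recording_seconds(disk_bytes, stream_bytes_per_s, compression_ratio=1.0):
    """Seconds of video that fit on a disk for a given camera data rate.

    compression_ratio is the assumed average ratio of raw size to
    compressed size (1.0 means the stream is stored uncompressed).
    """
    return disk_bytes * compression_ratio / stream_bytes_per_s


camera_rate = 2000 * MB   # assumed camera stream, inside the 200-4500 MByte/s range
disk_size = 2 * TB        # assumed SSD capacity

raw_time = recording_seconds(disk_size, camera_rate)          # uncompressed
jpeg_time = recording_seconds(disk_size, camera_rate, 10.0)   # assumed 10:1 JPEG

# Sustained write rate the SSD has to handle after compression.
ssd_write_rate = camera_rate / 10.0

print(f"raw: {raw_time:.0f} s, JPEG: {jpeg_time:.0f} s, "
      f"write rate: {ssd_write_rate / MB:.0f} MByte/s")
```

With these assumed numbers the compressed stream needs about 200 MByte/s of sustained write bandwidth, which is why a single conventional SSD can be enough.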
Super fast demosaicing and JPEG compression
Software for machine vision and industrial cameras usually uses the following image processing pipeline: read a stream of raw Bayer CFA images, demosaic them, compress them to JPEG, and save each image to disk. We have designed a solution that combines demosaicing and JPEG compression in our GPU software and gives a significant speedup. For example, on an NVIDIA GeForce RTX 4090 GPU our software needs less than 1 ms to carry out high quality demosaicing and JPEG compression of a raw 4K Bayer CFA image (quality 90%, 4:4:4 subsampling, MG demosaicing algorithm, I/O latency excluded).
Raw Bayer Compression for machine vision and 3D applications
To get an additional speedup in high performance applications, one can split a raw Bayer image into four color planes and compress each plane separately with the JPEG algorithm. This excludes debayering from the pipeline and leaves less data for JPEG compression. The total speedup can be up to 50% compared with a standard JPEG compression solution for raw Bayer data.
In that case, decompressing and viewing the data is more complicated, but this is not a problem for GPU image processing: it can be done entirely on GPU as well.
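The plane split and its inverse can be sketched on the CPU in a few lines. This is a minimal illustration, not the SDK implementation: `split_bayer` and `merge_bayer` are hypothetical names, the image is a plain list of rows with even width and height, and the JPEG step applied to each plane is omitted.

```python
def split_bayer(raw):
    """Split a Bayer mosaic into four half-resolution planes.

    Planes are ordered by position in the 2x2 CFA cell:
    top-left, top-right, bottom-left, bottom-right.
    """
    return [
        [row[dx::2] for row in raw[dy::2]]
        for dy in (0, 1)
        for dx in (0, 1)
    ]


def merge_bayer(planes):
    """Inverse of split_bayer: interleave four planes into one mosaic."""
    height = len(planes[0]) * 2
    width = len(planes[0][0]) * 2
    raw = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            plane = planes[(y % 2) * 2 + (x % 2)]
            raw[y][x] = plane[y // 2][x // 2]
    return raw


# Tiny 4x4 test mosaic with distinct values.
mosaic = [[r * 4 + c for c in range(4)] for r in range(4)]
planes = split_bayer(mosaic)
restored = merge_bayer(planes)
print(planes[0], restored == mosaic)
```

Each plane holds samples of a single CFA position, so it can be compressed as an independent grayscale image and interleaved back after decoding.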
We are developing high performance GPU-based software for streaming applications:
We can implement a full image processing pipeline on GPU to achieve good image quality, high performance and minimum latency.
Fastvideo Image & Video Processing SDK on CUDA is the core of many VR applications with 2K/4K/8K resolutions. High performance and excellent image processing quality are a must for most VR solutions.
Color management system according to camera and monitor DCP and ICC profiles
It is not enough to create a high quality camera for an imaging application: one also has to design powerful software to render and visualize images from the camera. This raises the question of color management, DCP and ICC profiles for camera and monitor, and related matters. To take these things into account, one can incorporate a Color Management System into the image processing pipeline, and to make it fast we have implemented it on GPU.
High performance server for multiple camera systems
When you need to record data from multiple cameras in real time, it can be a good idea to run a high quality image processing pipeline on GPU. Some of our customers do that to cope with huge amounts of data: they use Fastvideo SDK to run the full image processing pipeline on GPU in real time for multiple cameras. We can perform real-time image processing and visualization for two XIMEA xiB cameras with 65 MPix resolution, 10-bit, at 70 fps on one NVIDIA GeForce RTX 4090.
GPU RAW Processor
We've created GPU RAW Processor, which is capable of real-time image processing and video playback for RAW files. Transforming a series of RAW images into a 16-bit TIFF or 8/24-bit JPEG sequence is usually a time-consuming task. Now it can be done really fast on an NVIDIA GPU. We can also monitor RGB Parade and Histogram and apply wavelet-based denoising in real time or during video playback.
GPU RAW Processor offers a fast preview of your RAW frames directly from Windows Explorer: just right-click the folder and see the result in the player. The software also has excellent trimming capabilities for removing unnecessary frames from the footage.
Batch conversion of CR2, CR3, NEF, ARW, DNG raw images to JPEG/TIFF/JP2
The full image processing pipeline for Canon CR2, CR3, Nikon NEF and Adobe DNG raw data can be run very fast on GPU, and this is the way to ensure high performance conversion to JPG/TIF. The standard pipeline includes raw decoding and preprocessing, WB, demosaicing, denoising, color correction, curves and levels, DCP and LCP support, resizing, sharpening, 3D LUTs, visualization, etc.
Media & Entertainment, Digital Cinema and JPEG2000
JPEG2000 is widely adopted in digital cinema as well as in post production and broadcast. J2K solutions give excellent quality, though they need a lot of compute power and bandwidth. With GPU acceleration, JPEG2000 performance is no longer a bottleneck: our GPU-based JPEG2000 codec runs much faster than any CPU-based solution. You just need a conventional NVIDIA GPU for laptop, desktop or server. We also have special J2K solutions for the mobile GPUs Jetson TX2, NX/AGX Xavier and Orin.
Two other important applications are a fast MXF converter and an MXF player. The MXF format is widely used in M&E and Digital Cinema, so our solution can be used for real-time MXF reading, writing and transcoding.
JPEG2000 compression according to DCP is also a very interesting application. We do that on GPU very fast.
We are developing fast GPU-based software for transcoding applications:
We can implement your desired image processing pipeline on GPU to achieve target image quality, bitrate, performance and latency.
Remote collaborative post production for editing, color grading and reviewing
We can capture a live feed from HD-SDI or 3G-SDI sources in real time, then write the data to local storage and simultaneously copy it to an NVIDIA GPU for preprocessing and J2K encoding. The software can stream the compressed data over commodity internet to a remote PC/server. At the remote post production facility, the stream is received and decoded to make the data immediately available for editing or processing. Here you can find more info for that solution.
Capturing, encoding and delivering broadcast quality video from SD, HD and 3G-SDI cameras and grabbers with minimum latency
That solution was developed for single-camera and multi-camera environments and provides full capture, encoding and delivery of multiple HD-SDI video streams with GPU-based processing. We support SD-, HD- and 3G-SDI grabbers at 2K/4K/8K resolutions and more.
FFmpeg codecs and filters on GPU
A huge number of applications are based on FFmpeg. To accelerate FFmpeg, we've implemented GPU-based codecs and filters that can be easily integrated into existing CPU-based software built on FFmpeg.
These are our GPU-based solutions which are fully compatible with FFmpeg:
Image recompression and resize on GPU for web applications
If you run a really big photo hosting service or a powerful image server, you have to think about how to optimize image loading from your database to the client's browser. Original photos are usually saved in JPEG format and do not have the resolution you need on the HTML page, so you need to do a fast image resize before sending the photo. Complete workflow is the following:
We offer a GPU solution for that task, and its performance is much better than that of standard CPU solutions. With that approach we can solve one more problem: we can create a super fast thumbnail generator, capable of working with a really big number of JPEG images. That could be interesting for photo catalogs, web shops, etc. For a 1920×1080 JPEG image with quality 90% and 4:4:4 subsampling, we need around 1 ms to resize it to 960×540 on a Tesla V100. These are our JPEG resize time measurements for Tesla V100.
JPG image loading, decoding and visualizing
Let us consider a situation when we need to load a JPG image and visualize it on the monitor. If we do image decoding on CPU first and then send the image to GPU, we have the following image processing pipeline:
If we do image decoding on GPU, the pipeline is not the same:
JPG decompression is faster on GPU, sending compressed data to GPU takes less time than sending uncompressed data, and CPU usage is lower, so decoding and visualizing on GPU is the faster option. This is the way to work with 8K and 16K resolutions.
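The transfer part of this argument can be put into numbers. The sketch below is a rough estimate under stated assumptions: the 10:1 JPEG compression ratio and the 12 GByte/s effective host-to-GPU bandwidth are hypothetical values chosen for illustration, and decoding time itself is not modeled.

```python
def transfer_ms(num_bytes, bandwidth_bytes_per_s):
    """Time in milliseconds to move num_bytes over a link."""
    return num_bytes / bandwidth_bytes_per_s * 1000.0


width, height = 7680, 4320            # 8K frame
raw_bytes = width * height * 3        # decoded 24-bit RGB image

assumed_ratio = 10.0                  # assumed JPEG compression ratio
assumed_bus = 12e9                    # assumed effective host-to-GPU bandwidth, bytes/s

jpeg_bytes = raw_bytes / assumed_ratio

raw_ms = transfer_ms(raw_bytes, assumed_bus)    # decode on CPU, upload pixels
jpeg_ms = transfer_ms(jpeg_bytes, assumed_bus)  # upload JPEG, decode on GPU

print(f"uncompressed upload: {raw_ms:.2f} ms, compressed upload: {jpeg_ms:.2f} ms")
```

Under these assumptions the upload time shrinks by the compression ratio, and the gap grows with resolution.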
Here we get four benefits:
Video conferencing over gigabit network
The idea is quite simple: one can do online JPEG or JPEG2000 compression on GPU and send the compressed data frame by frame to offer video conferencing over a gigabit network. That solution needs two PCs or laptops with NVIDIA GPUs.
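How much compression a gigabit link requires depends on the video format. The following sketch estimates the minimum compression ratio per stream; `min_compression_ratio` is a hypothetical helper, and the 80% usable share of the link is an assumption that leaves room for protocol overhead.

```python
def min_compression_ratio(width, height, bytes_per_pixel, fps,
                          link_bits_per_s=1e9, usable_fraction=0.8):
    """Smallest compression ratio that fits a raw video stream into the link.

    usable_fraction is an assumed share of the nominal link rate that is
    available for payload.
    """
    raw_bits_per_s = width * height * bytes_per_pixel * 8 * fps
    return raw_bits_per_s / (link_bits_per_s * usable_fraction)


full_hd_30 = min_compression_ratio(1920, 1080, 3, 30)
uhd_60 = min_compression_ratio(3840, 2160, 3, 60)

print(f"1080p30 needs at least {full_hd_30:.2f}:1, 4K60 at least {uhd_60:.2f}:1")
```

A ratio below the one actually achieved by the codec means the stream fits; frame-by-frame compression keeps latency close to the time needed for a single frame.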
Image compression and image enhancement in the field of medical imaging
Image compression is very important in medical imaging because medical equipment can generate huge amounts of image data, so it is a must for that kind of application. Examples include JPEG-DICOM and JPEG2000-DICOM converters and DICOM viewers.
We can offer the following high performance codecs for PACS software (Ultrasound, Endoscopy, X-Ray, CT, MR, PET):
One more interesting application for medical imaging is GPU-accelerated image enhancement technology. NVIDIA GPUs and Fastvideo SDK are used in Medical Vision software for flexible video endoscopy. That kind of solution could be used for raw image processing in endoscopy as well.
Film scanners and book scanners
Quite a lot of such scanners are based on high-resolution machine vision cameras. They usually work at a low frame rate in 12-bit mode. That can be acceptable for book scanners, but for film scanners HDR is a must. It can be implemented via a multi-exposure approach to get intermediate 16-bit frames. Such image processing is quite complicated, and we've done several projects in that field.
Our software can do real-time screen capture on the host and send the video stream to a remote PC via a network or wireless connection. We compress the data on GPU and send it over the network, then decompress it on the remote hardware (usually a custom board or an NVIDIA Tegra GPU) and show the image or video on a monitor. The main idea is to increase the distance between host and display with minimum latency.
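The multi-exposure idea can be illustrated with a deliberately simplified merge of two 12-bit exposures into one 16-bit frame. This is a sketch under assumptions, not the method used in those projects: the 16x exposure ratio and the saturation threshold of 4000 are hypothetical, the data is assumed linear, and real pipelines add alignment, noise-aware weighting and smooth blending.

```python
def merge_exposures(short_px, long_px, ratio=16, saturation=4000):
    """Merge two 12-bit exposures of the same scene into 16-bit values.

    The long exposure is assumed to collect `ratio` times more light.
    Where it is below the saturation threshold it is used directly;
    otherwise the short exposure is scaled up to the same units.
    """
    merged = []
    for s, l in zip(short_px, long_px):
        value = l if l < saturation else s * ratio
        merged.append(min(value, 65535))  # clamp to the 16-bit range
    return merged


short_frame = [10, 300, 4095]    # 12-bit samples, short exposure
long_frame = [160, 4095, 4095]   # same pixels, 16x longer exposure

hdr_frame = merge_exposures(short_frame, long_frame)
print(hdr_frame)
```

Dark pixels keep the cleaner long-exposure value, while pixels that clip in the long exposure are recovered from the short one, which extends the 12-bit range to 16 bits.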
Video Walls on GPU
A Video Wall application is basically a PC-based network video server that delivers a variety of content to 30–50 remote displays or even more. Content can include source images or video with resolutions in the range from 4K to 8K. All image processing is done on an NVIDIA GPU in real time.