Applications for fast Image & Video Processing on GPU
We offer custom solutions for high performance imaging applications in the fields described below.
For a better understanding of image processing performance in these solutions, please have a look at our current benchmarks.
Long-term video recording for cameras with high frame rate
Many modern high-speed, high-resolution cameras produce data rates of 200–2400 MByte/s or even more, which makes it quite complicated to capture the stream and save it to HDD/SSD. One way to solve that problem is real-time JPEG compression on GPU, which can increase the duration of video recording by 10–20 times. A further benefit is that a conventional HDD or SSD can be used instead of a RAID. On an NVIDIA GeForce GTX 1080 Ti, JPEG compression of a 24-bit 4K image takes less than 0.5 ms.
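The effect of compression on recording time can be estimated with simple arithmetic. The sketch below is only an illustration: the 1 TB disk, the 1000 MByte/s camera and the 10:1 compression ratio are assumed example values, and the function names are hypothetical, not part of any SDK.

```python
MB = 1_000_000  # bytes in one MByte (decimal convention, an assumption)


def recording_seconds(disk_bytes, stream_bytes_per_s, compression_ratio=1.0):
    """Seconds of footage that fit on a disk of the given capacity."""
    if stream_bytes_per_s <= 0 or compression_ratio <= 0:
        raise ValueError("data rate and compression ratio must be positive")
    return disk_bytes * compression_ratio / stream_bytes_per_s


def disk_write_rate(stream_bytes_per_s, compression_ratio):
    """Sustained write rate the disk has to absorb after compression."""
    return stream_bytes_per_s / compression_ratio


# Hypothetical setup: 1 TB drive, 1000 MByte/s camera, 10:1 JPEG compression.
disk = 1_000_000 * MB
rate = 1000 * MB
raw_seconds = recording_seconds(disk, rate)
jpeg_seconds = recording_seconds(disk, rate, 10.0)
write_rate_mb = disk_write_rate(rate, 10.0) / MB

print(f"raw: {raw_seconds:.0f} s, compressed: {jpeg_seconds:.0f} s, "
      f"disk write rate: {write_rate_mb:.0f} MByte/s")
```

Under these assumptions the compressed stream needs a write rate that a single conventional drive can sustain, which is why a RAID may no longer be required.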
Super fast demosaicing and JPEG compression
Software for machine vision and industrial cameras usually uses the following image processing pipeline: read a stream of raw Bayer CFA images, demosaic them, apply JPEG compression, and save each image to disk. We have combined demosaicing and JPEG compression in our GPU software and obtained a significant speedup. For example, on an NVIDIA GeForce GTX 1080 our software needs less than 2 ms to carry out high quality demosaicing and JPEG compression of a raw 4K image with Bayer CFA (quality 90%, 4:4:4 subsampling, DFPD demosaicing algorithm, I/O latency excluded).
Raw Bayer Compression for machine vision and 3D applications
To get an additional speedup in high performance applications, one can split the raw Bayer image into four color planes and compress them separately with the JPEG algorithm. This removes debayering from the pipeline and leaves less data for JPEG compression. The total speedup can be up to 50% in comparison with the standard JPEG compression solution for raw Bayer data.
In that case decompressing and viewing the data is more complicated, but this is not a problem for GPU image processing, since it can also be done completely on GPU.
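Splitting a Bayer mosaic into four color planes can be sketched as follows. This is a CPU-side illustration in plain Python, not the GPU implementation; it assumes an RGGB pattern and an image with even width and height, and the function names are hypothetical. Each resulting plane would then be passed to a JPEG encoder as a separate grayscale image.

```python
# Position of each color inside a 2x2 Bayer cell as (row offset, column offset).
# RGGB layout is an assumption; other sensors use GRBG, GBRG or BGGR.
OFFSETS = {"R": (0, 0), "G1": (0, 1), "G2": (1, 0), "B": (1, 1)}


def split_bayer_planes(raw):
    """Split a Bayer mosaic (list of rows) into four half-resolution planes."""
    height = len(raw)
    width = len(raw[0]) if height else 0
    if height % 2 or width % 2:
        raise ValueError("expected even width and height")
    return {
        name: [row[dx::2] for row in raw[dy::2]]
        for name, (dy, dx) in OFFSETS.items()
    }


def merge_bayer_planes(planes):
    """Inverse of split_bayer_planes: rebuild the mosaic for viewing."""
    half_h = len(planes["R"])
    half_w = len(planes["R"][0]) if half_h else 0
    raw = [[0] * (2 * half_w) for _ in range(2 * half_h)]
    for name, (dy, dx) in OFFSETS.items():
        for y, row in enumerate(planes[name]):
            raw[2 * y + dy][dx::2] = row
    return raw


mosaic = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]
planes = split_bayer_planes(mosaic)
restored = merge_bayer_planes(planes)
```

The merge step shows why viewing is more involved: the decoder has to interleave the four planes back into a mosaic and only then demosaic it.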
Fastvideo Image & Video Processing SDK on CUDA is the core of many VR applications with 2K/4K/8K resolutions. High performance and excellent image processing quality are a must for most VR solutions.
Colour management system according to camera and monitor ICC profiles
It's not enough to create a high quality camera for an imaging application; one also has to design powerful software to render and visualize images from the camera. This raises the question of colour management, ICC profiles for camera and monitor, and related matters. To take these things into account one can incorporate a Color Management System into image processing, and to make it fast we have implemented it on GPU.
High performance server for multiple camera systems
When you need to record data from multiple cameras in real time, it could be a good idea to use a high quality image processing pipeline on GPU. Some of our customers do that to cope with huge amounts of data: they use Fastvideo SDK to run the full image processing pipeline on GPU in real time for multiple cameras. We can perform realtime image processing and visualization for two XIMEA xiB cameras with 20 Mpix resolution, 12-bit, at 30 fps on one NVIDIA GeForce GTX 1080.
Realtime panorama stitching on CUDA
We offer a solution for realtime panorama stitching on GPU. All stages of image processing are done on GPU.
Fast CinemaDNG Processor on GPU
We've created Fast CinemaDNG Processor, which is capable of realtime DNG image processing and video playback for DNG/CinemaDNG raw files. Transforming a series of CinemaDNG images into a 16-bit TIFF or 8/12-bit JPEG sequence is usually a time-consuming task. Now it can be done really fast on an NVIDIA GPU. We can also monitor RGB Parade and Histogram and apply wavelet-based denoising in realtime or during video playback.
Fast CinemaDNG Processor offers a fast preview of your raw DNG data directly from Windows Explorer: just right-click the folder and see the result in the player. The software also has excellent trimming capabilities for removing unnecessary frames from the footage. It has a super fast lossless JPEG decoder on CPU, which is important for achieving smooth video output in the DNG player. This is a tool for fast and efficient preprocessing before color grading in Blackmagic DaVinci Resolve or Adobe Premiere Pro.
Batch conversion of CR2 and DNG raw images to JPEG/TIFF
The full image processing pipeline for Canon CR2 and Adobe DNG raw data can be run very fast on GPU, and this is the way to ensure high performance conversion to JPG/TIF. The standard pipeline includes raw decoding and preprocessing, WB, demosaicing, denoising, color correction, curves and levels, DCP support, resizing, sharpening, visualizing, etc.
Media & Entertainment, Digital Cinema and JPEG2000
JPEG2000 is widely adopted in digital cinema as well as in post production and broadcast. J2K solutions deliver excellent quality, though they need a lot of compute power and bandwidth. JPEG2000 performance is no longer a bottleneck: our GPU-based JPEG2000 codec runs much faster than any CPU-based solution. You just need a conventional NVIDIA GPU for laptop, desktop or server. We also have a special J2K solution for the mobile GPUs Tegra X1 and X2.
Capturing and encoding broadcast quality video from SD, HD and 3G-SDI grabbers
This solution was developed for single- or multi-camera environments and provides a full image processing pipeline on GPU for several HD-SDI video streams. We support SD-, HD- and 3G-SDI grabbers at 2K/4K/8K resolutions.
Image recompression and resize on GPU for web applications
If you run a really big photo hosting service or a powerful image server, you have to think about how to optimize image loading from your database to the client's browser. Original photos are usually stored in JPEG format and their resolution does not match what the HTML page needs, so a fast image resize is required before sending the photo. The complete workflow is the following:
We offer a GPU solution for that task, and its performance is much better than that of standard CPU solutions. The same approach solves one more problem: it gives a super fast thumbnail generator capable of working with a really large number of JPEG images. That could be interesting for photo catalogs, web shops, etc. Currently, for a 2000×1500 JPEG image with quality 90% and 4:4:4 subsampling, resizing to 1000×750 takes about 7–8 ms on Tesla K10.
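The resize step in the 2000×1500 to 1000×750 example is an exact 2× reduction in each dimension. A minimal CPU-side sketch of such a reduction is shown below for a single 8-bit channel. The 2×2 box filter is an assumed stand-in chosen for brevity, not necessarily the filter used in the GPU solution, and the function name is hypothetical; JPEG decoding before the resize and encoding after it are left out.

```python
def downscale_2x(image):
    """Halve width and height by averaging each 2x2 block (box filter).

    `image` is a list of rows of integer samples for one channel.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    if height % 2 or width % 2:
        raise ValueError("expected even width and height")
    result = []
    for y in range(0, height, 2):
        top, bottom = image[y], image[y + 1]
        result.append([
            # "+ 2" rounds to nearest instead of truncating.
            (top[x] + top[x + 1] + bottom[x] + bottom[x + 1] + 2) // 4
            for x in range(0, width, 2)
        ])
    return result


sample = [
    [0, 2, 10, 10],
    [4, 6, 10, 10],
]
small = downscale_2x(sample)
```

For a color image the same function would be applied to each channel separately.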
JPG image loading and visualizing
Let us consider a situation where we need to load a JPG image and visualize it on the monitor. If we decompress the image on CPU first and then send it to GPU, we have the following image processing pipeline:
If we do image decompression on GPU, the pipeline is not the same:
Since JPG decompression is faster on GPU, and sending compressed data to the GPU takes less time than sending uncompressed data (with lower CPU usage as well), decompression and visualization on GPU is the faster option. This is the way to work with 8K and 16K resolutions.
Here we get benefits four times:
Video conferencing over gigabit network
The idea is quite simple: JPEG compression is done online on GPU and the compressed data is sent frame by frame, which makes video conferencing over a gigabit network possible. This solution requires two PCs or laptops with NVIDIA GPUs.
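A quick calculation shows why compression is needed on a gigabit link. The sketch below is an illustration under assumed values: the frame size, frame rate and the 90% usable share of the link are example figures, and the function name is hypothetical.

```python
def required_compression_ratio(width, height, bytes_per_pixel, fps,
                               link_bits_per_s=1_000_000_000,
                               usable_fraction=0.9):
    """Minimum compression ratio for an uncompressed stream to fit the link.

    usable_fraction leaves headroom for protocol overhead (an assumption).
    """
    raw_bits_per_s = width * height * bytes_per_pixel * 8 * fps
    budget_bits_per_s = link_bits_per_s * usable_fraction
    return raw_bits_per_s / budget_bits_per_s


# Hypothetical streams: 24-bit Full HD at 30 fps and 24-bit 4K at 60 fps.
full_hd_ratio = required_compression_ratio(1920, 1080, 3, 30)
uhd_ratio = required_compression_ratio(3840, 2160, 3, 60)

print(f"Full HD @ 30 fps needs at least {full_hd_ratio:.2f}:1")
print(f"4K @ 60 fps needs at least {uhd_ratio:.2f}:1")
```

Any ratio above 1 means the uncompressed stream does not fit, so even the Full HD example cannot be sent over a gigabit link without compression.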
Image compression and image enhancement in the field of medical imaging
Image compression is very important in medical imaging because medical equipment can generate huge amounts of image data, so it is a must for this kind of application. Examples include JPEG-DICOM and JPEG2000-DICOM converters.
Another interesting application for medical imaging is GPU-accelerated image enhancement technology. NVIDIA GPUs and Fastvideo SDK are used in Medical Vision software for flexible video endoscopy. That kind of solution could be used for raw image processing in endoscopy as well.
Our software can do real-time screen capture on the host and send the video stream to a remote PC via a network or wireless connection. We compress data on GPU and send it over the network, then decompress it on the remote hardware (usually a custom board or NVIDIA Tegra GPU) and show the image or video on a monitor. The main idea is to increase the distance between host and display with minimum latency.
Video Walls on GPU
A Video Wall application is basically a PC-based network video server that delivers a variety of content to 30–50 remote displays in total, or even more. Content can include source images or video with resolutions in the range from 4K to 8K. All image processing is done on NVIDIA GPU in real time.