JPEG2000 codec on GPU
The GPU JPEG2000 codec from Fastvideo is based on NVIDIA technology. It is a full, performance-oriented implementation of JPEG2000. Fast JPEG2000 compression and decompression on the GPU come from a parallel implementation and thorough optimization of the JPEG2000 algorithm. To our knowledge, our JPEG2000 encoder on GPU is the fastest on the market.
Key Features of GPU JPEG2000 Codec
- JPEG2000 encoding and decoding for grayscale and color images with arbitrary width and height
- Lossy (wavelet CDF 9/7) and lossless (wavelet CDF 5/3) image compression and decompression
- Bit depth: 8-16 bits per channel (up to 24 bits per channel)
- Color spaces: sRGB, Rec.709, Adobe RGB, ProPhoto RGB, DCI P3, XYZ, Linear
- Number of decomposition levels: 1–12
- Code-block size 16×16, 32×32 or 64×64
- Chroma subsampling modes: 4:4:4, 4:2:2, 4:2:0
- Image quality in the range of 0–100 (non-integer values are allowed)
- Tile support for encoder and decoder
- Rate control option to constrain image compression ratio
- Window mode for JPEG2000 decoder
- Progressions support:
  - J2K encoder: LRCP
  - J2K decoder: LRCP, RLCP, RPCL, PCRL or CPRL
- Data input: images from HDD/RAID/SSD or CPU/GPU memory
- Data output: final compressed or uncompressed image in HDD/RAID/SSD or CPU/GPU memory
- Modes of operation:
  - Single image mode
  - Multi-tile mode for very large images (geospatial, digital pathology, etc.)
  - Batch mode for better performance
  - Multithreaded batch mode (maximum performance)
- Standard set of computations for JPEG2000 compression and decompression on GPU
  - GPU JPEG2000 Encoder
    - Input data parsing
    - Color Transform (ICT/RCT) and DC-level shifting
    - 2D DWT (Discrete Wavelet Transform) with CDF 9/7 or 5/3 wavelets
    - Quantization
    - EBCOT Tier-1 coding (context modeling and arithmetic MQ-Coder)
    - PCRD (Post-Compression Rate-Distortion; optional)
    - Tier-2 coding (Packets, Layers, Precincts, Tag Trees)
    - Output formatting
  - GPU JPEG2000 Decoder
    - Input parsing
    - Packet decoding
    - Entropy decoding
    - Coefficient bit modeling
    - Inverse Quantization
    - Inverse DWT
    - Inverse Color Transform and DC-level shifting
    - Output formatting
- Optimized for the latest NVIDIA GPUs
- Performance is much better than that of CPU-based JPEG2000 codecs such as JasPer, OpenJPEG and Kakadu
- Performance is significantly higher than that of the GPU-based JPEG2000 encoders CUJ2K and GPU JPEG2K
- Optional integration with OpenGL
- Compatibility with FFmpeg library to read/write Motion JPEG2000 streams (FFmpeg is under LGPL v2.1)
- FFmpeg JPEG2000 codec on GPU for Windows, Linux, ARM
- Compatible with 64-bit Windows 10/11, Ubuntu Linux, L4T
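The reversible (lossless) steps named in the encoder and decoder lists above, DC-level shifting, RCT and the 5/3 wavelet, can be illustrated with a small CPU-side sketch. This is not the Fastvideo GPU implementation: it is a textbook-style illustration of the integer arithmetic, limited to one decomposition level in one dimension, and all function names are hypothetical. Its only purpose is to show why this path reconstructs the input exactly.

```python
# Illustrative CPU sketch of the reversible JPEG2000 steps (not Fastvideo code).
# All function names here are hypothetical.

def dc_shift(sample, bit_depth=8):
    """DC-level shift: centre unsigned samples around zero."""
    return sample - (1 << (bit_depth - 1))

def rct_forward(r, g, b):
    """Reversible Color Transform on integer samples."""
    y = (r + 2 * g + b) >> 2      # floor((R + 2G + B) / 4)
    return y, b - g, r - g        # Y, Cb, Cr

def rct_inverse(y, cb, cr):
    """Exact inverse of rct_forward."""
    g = y - ((cb + cr) >> 2)
    return cr + g, g, cb + g      # R, G, B

def _mirror(i, n):
    """Whole-sample symmetric extension of index i into [0, n)."""
    if i < 0:
        return -i
    if i >= n:
        return 2 * (n - 1) - i
    return i

def dwt53_forward(x):
    """One level of the 1-D reversible 5/3 wavelet via integer lifting."""
    n = len(x)
    y = list(x)
    if n < 2:
        return y, []
    # Predict step: odd samples become high-pass (detail) coefficients.
    for i in range(1, n, 2):
        y[i] -= (y[_mirror(i - 1, n)] + y[_mirror(i + 1, n)]) >> 1
    # Update step: even samples become low-pass (approximation) coefficients.
    for i in range(0, n, 2):
        y[i] += (y[_mirror(i - 1, n)] + y[_mirror(i + 1, n)] + 2) >> 2
    return y[0::2], y[1::2]

def dwt53_inverse(low, high):
    """Undo dwt53_forward by running the lifting steps in reverse order."""
    n = len(low) + len(high)
    y = [0] * n
    y[0::2] = low
    y[1::2] = high
    if n < 2:
        return y
    for i in range(0, n, 2):
        y[i] -= (y[_mirror(i - 1, n)] + y[_mirror(i + 1, n)] + 2) >> 2
    for i in range(1, n, 2):
        y[i] += (y[_mirror(i - 1, n)] + y[_mirror(i + 1, n)]) >> 1
    return y

if __name__ == "__main__":
    row = [dc_shift(v) for v in [10, 12, 14, 200, 13, 11, 9]]
    low, high = dwt53_forward(row)
    assert dwt53_inverse(low, high) == row
    assert rct_inverse(*rct_forward(12, 200, 7)) == (12, 200, 7)
    print("lossless round trip OK")
```

Because every lifting step adds or subtracts an integer that the decoder can recompute from unchanged neighbours, the inverse simply replays the steps in reverse order with opposite signs. The lossy path (ICT and the 9/7 wavelet) uses floating-point arithmetic and quantization and is not covered by this sketch.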
We can integrate the JPEG2000 codec into your image processing pipeline so that the whole job runs on the GPU. Please check the description of our GPU Image & Video Processing SDK to evaluate what we can do on GPU.
Support 
- Time and performance benchmarks for JPEG2000 encoder and decoder on GPU
- Full technical support up to successful integration
- JPEG 2000 SDK, documentation, sample applications
Roadmap for fast JPEG2000 Codec
- GPU JPEG2000 Encoder - done
- GPU JPEG2000 Decoder - done
- MXF player on GPU - done
- High performance JPEG2000 decoder on GPU for FFmpeg - done
- GPU JPEG2000 transcoder from MXF to H264/H265 for FFmpeg
- JPX format with GMLJP2 markup (JPEG 2000 Part 2) - done
- High performance JPEG2000 codec on GPU for FFmpeg - done
- Fast JPEG2000 Viewer on GPU for geospatial applications with GMLJP2 support - done
- Fast J2K Player on GPU - done
- GPU JPEG2000 performance optimization (multithreaded batch mode for J2K encoding with tiles) - done
- New JPEG2000 benchmarks on the NVIDIA GeForce RTX 4090 GPU - done
- RAW Bayer Codec on GPU - in progress
JPEG2000 encoder performance on the NVIDIA GeForce RTX 4090 in multithreaded batch mode
These are the latest benchmarks on the GeForce RTX 4090:
| JPEG2000 encoding parameters | Lossy encoding | Lossless encoding |
| --- | --- | --- |
| 2K image, 24-bit, cb 32×32 | 2108 fps | 1238 fps |
| 4K image, 24-bit, cb 32×32 | 732 fps | 425 fps |
Here you can get more info about JPEG2000 encoder and decoder benchmarks, utilized images and other parameters.
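To compare the frame rates above with other codecs or with storage bandwidth, it can help to convert them into pixel and byte throughput. The sketch below does that arithmetic. The fps values are copied from the benchmark table; the frame dimensions (1920×1080 for "2K" and 3840×2160 for "4K") are assumptions, since this page does not state them, and should be replaced with the dimensions of the actual test images.

```python
# Convert benchmark frame rates into throughput figures.
# fps values come from the table on this page; frame sizes are ASSUMED.

BENCHMARK_FPS = {
    ("2K", "lossy"): 2108,
    ("2K", "lossless"): 1238,
    ("4K", "lossy"): 732,
    ("4K", "lossless"): 425,
}

# Assumption: the page only says "2K image" and "4K image".
ASSUMED_FRAME_SIZE = {"2K": (1920, 1080), "4K": (3840, 2160)}

BYTES_PER_PIXEL = 3  # 24-bit colour, as stated in the table


def throughput(fps, width, height, bytes_per_pixel=BYTES_PER_PIXEL):
    """Return (megapixels per second, uncompressed gigabytes per second)."""
    pixels_per_second = fps * width * height
    return pixels_per_second / 1e6, pixels_per_second * bytes_per_pixel / 1e9


if __name__ == "__main__":
    for (size, mode), fps in BENCHMARK_FPS.items():
        width, height = ASSUMED_FRAME_SIZE[size]
        mpix, gbytes = throughput(fps, width, height)
        print(f"{size} {mode}: {mpix:.0f} MPix/s, {gbytes:.2f} GB/s of input data")
```

The gigabytes-per-second figure refers to uncompressed input data, so it indicates how fast source images must be delivered to the GPU to sustain the quoted frame rate.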
Supported parameters for GPU-based JPEG2000 encoder
You can download our sample application for JPEG2000 encoding and decoding on GPU here. You can also use the J2kEncoderSample and J2KDecoderSample sample applications from the Fastvideo SDK. The parameters used for JPEG2000 encoding and decoding on GPU are described below.
- -i "file name" - input file in BMP/PGM/PPM format
- -if "folder + mask" - folder path and mask for input files in BMP/PGM/PPM formats
- -o "file name" - output file in JP2 format
- -maxWidth "unsigned int" - set maximum image width for multiple file processing
- -maxHeight "unsigned int" - set maximum image height for multiple file processing
- -d "value" - GPU (device) index
- -a "algorithm name" - one of JPEG2000 encoding algorithms (rev - reversible/lossless, irrev - irreversible/lossy)
- -b "value" - batch size (maximum number of simultaneously processed images)
- -c "value" - codeblock size: 16, 32 (default), 64
- -q "value" - quality (in the range 1-100). Specifies losses at quantization stage.
- -l "value" - number of resolution levels (in the range 1-12)
- -cr "value" - compression ratio. Specifies truncation of encoded codeblocks. Using this option enables PCRD algorithm.
- -s "subsampling" - subsampling 444, 422 or 420 for color images. Default is 444.
- -repeat "value" - how many times to repeat encoding for one image
- -repeatTime "value" - time to repeat encoding for all images (in ms)
- -showFrames - if set, will show execution time for each step
- -discard - do not write the output to disk
- -noMCT - disable Multi-Component (YCbCr) transformation
- -noHeader - disable JP2 header generation
- -tileWidth "value" - tile width
- -tileHeight "value" - tile height
- -outputBitdepth "value" - defines output file bit depth. If the parameter is not set, then output bit depth is taken from PPM/PGM file.
- -overwriteSourceBitdepth "value" - overwrite the range of input image values
- -info - time / performance output is on. Default is off.
- -log "file" - enable log file
- -async - enable async mode to overlap processing and file read/write
- -thread "value" - number of processing threads in async mode. Default value is 1.
- -threadR "value" - number of file reader threads in async mode. Default value is 1.
- -threadW "value" - number of file writer threads in async mode. Default value is 1.
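When the encoder sample is driven from a script, it can be convenient to assemble the argument list programmatically and reject out-of-range values before launching the process. The following is a minimal sketch under stated assumptions: the helper `build_encoder_args` is hypothetical (it is not part of the Fastvideo SDK), it covers only a subset of the flags listed above, and the range checks simply mirror the ranges given in that list.

```python
# Hypothetical helper that assembles a J2kEncoderSample command line.
# It only uses flags documented in the parameter list on this page.
import shlex


def build_encoder_args(input_path, output_path, *, algorithm="irrev",
                       codeblock=32, levels=None, quality=None, device=0,
                       threads=None, batch=None, reader_threads=1,
                       writer_threads=1, repeat=None,
                       executable="J2kEncoderSample.exe"):
    """Return the argument list; raise ValueError on out-of-range values."""
    if algorithm not in ("rev", "irrev"):
        raise ValueError("algorithm must be 'rev' (lossless) or 'irrev' (lossy)")
    if codeblock not in (16, 32, 64):
        raise ValueError("codeblock size must be 16, 32 or 64")
    if levels is not None and not 1 <= levels <= 12:
        raise ValueError("number of resolution levels must be in 1-12")
    if quality is not None and not 1 <= quality <= 100:
        raise ValueError("quality must be in 1-100")

    args = [executable, "-i", str(input_path), "-o", str(output_path),
            "-a", algorithm, "-c", str(codeblock)]
    if levels is not None:
        args += ["-l", str(levels)]
    if quality is not None:
        args += ["-q", str(quality)]
    args += ["-d", str(device)]
    if threads is not None:
        # Async mode overlaps processing with file read/write.
        args += ["-async", "-thread", str(threads)]
    if batch is not None:
        args += ["-b", str(batch)]
    if threads is not None:
        args += ["-threadR", str(reader_threads), "-threadW", str(writer_threads)]
    if repeat is not None:
        args += ["-repeat", str(repeat)]
    return args


if __name__ == "__main__":
    cmd = build_encoder_args("input.ppm", "output.jp2", levels=7, quality=90,
                             threads=4, batch=2, repeat=1000)
    print(shlex.join(cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` on a machine where the sample application is installed. Passing a list rather than a single string avoids shell quoting problems with file names that contain spaces.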
Examples of command lines for JPEG2000 encoder and decoder sample applications
Single mode for JPEG2000 encoding and decoding
Encoding: J2kEncoderSample.exe -i "input image" -o "output image" -a "name" -c "value" -l "value" -d "device ID" -q "value" -info
Decoding: J2KDecoderSample.exe -i "input image" -o "output image" -d "device ID" -info
Batch mode for JPEG2000 encoding and decoding
Encoding: J2kEncoderSample.exe -i "input image" -o "output image" -a "name" -c "value" -d "device ID" -q "value" -l "value" -b "value" -repeat "value"
Decoding: J2KDecoderSample.exe -i "input image" -o "output image" -d "device ID" -b "value" -repeat "value"
Multithreaded batch mode for lossy JPEG2000 encoding and decoding
Encoding: J2kEncoderSample.exe -i input.ppm -o output.jp2 -a irrev -c 32 -l 7 -q 90 -d 0 -async -thread 4 -b 2 -threadR 1 -threadW 1 -repeat 1000
Decoding: J2KDecoderSample.exe -i input.jp2 -o output.ppm -d 0 -async -thread 4 -b 2 -threadR 1 -threadW 1 -repeat 1000