nvJPEG2000 Documentation¶
Introduction¶
The nvJPEG2000 library accelerates the decoding of JPEG 2000 images on NVIDIA GPUs. The library is built on the CUDA platform and is supported on Pascal+ GPU architectures.
Note
Throughout this document, the terms “CPU” and “Host” are used synonymously. Similarly, the terms “GPU” and “Device” are synonymous.
nvJPEG2000 Decoder¶
The library uses both the CPU and the GPU for decoding. The Tier-2 decode stage (the first stage of decoding; refer to the JPEG 2000 specification for details) runs on the CPU. All other stages of the decoding process are offloaded to the GPU.
The nvJPEG2000 decoder supports the following subset of JPEG 2000 Part 1:
Up to 16 bits per component
Number of components (channels): 1 or 3
Reversible (5-3) and irreversible (9-7) wavelet transforms
Multiple tiles per image. Each tile can have a single tile partition
Single quality layer per image
All the component dimensions should be the same (444 subsampling only)
LRCP (Layer Resolution Component Position) and RLCP progression orders are supported
Image and tile start offsets should be 0
The JP2 file format and JPEG 2000 codestreams are supported
Prerequisites¶
CUDA Toolkit version 11.0 and above
CUDA Driver version r450 and above
Platforms Supported¶
Linux versions:
| Architecture | Distribution Name | Version | Kernel | GCC | GLIBC |
|---|---|---|---|---|---|
| x86_64 | RHEL/CentOS | 7.8 | 3.10.0 | 4.8.5 | 2.17 |
| x86_64 | RHEL/CentOS | 8.2 | 4.18 | 8.3.1 | 2.28 |
| x86_64 | Ubuntu | 20.04.1 | 5.4.0 | 9.3.0 | 2.32 |
| x86_64 | Ubuntu | 18.04.5 | 4.15.0 | 8.2.0 | 2.27 |
| x86_64 | Ubuntu | 16.04.7 | 4.5.0 | 5.4.0 | 2.23 |
| x86_64 | OpenSUSE Leap | 15.2 | 4.12.14 | 7.4.0 | 2.26-lp151 |
| x86_64 | SUSE SLES | 15.2 | 4.12.14 | 7.3.1 | 2.26 |
Windows versions:
Windows 10 and Windows Server 2019
Support added from version 0.1.0 onwards
Thread Safety¶
Not all nvJPEG2000 types are thread-safe. The following handles should be instantiated separately for each thread: nvjpeg2kDecodeState_t and nvjpeg2kStream_t.
Quick Start Guide¶
This section explains how to use the decoder API in a few quick steps. The API details are covered in the following sections.
JPEG2000 Decode¶
The library expects the bitstream to be in host memory and writes the decoded output to device memory.
Note
nvJPEG2000 Decode sample can be found under this repository: https://github.com/NVIDIA/CUDALibrarySamples/tree/master/nvJPEG2000
1. Initialize the handles used in the decoder workflow:
nvjpeg2kStream_t: used to parse the bitstream and store the bitstream metadata
nvjpeg2kHandle_t: the nvjpeg2k library handle
nvjpeg2kDecodeState_t: used to store the work buffers required for decode
nvjpeg2kHandle_t nvjpeg2k_handle;
nvjpeg2kStream_t nvjpeg2k_stream;
nvjpeg2kDecodeState_t decode_state;
nvjpeg2kCreateSimple(&nvjpeg2k_handle);
nvjpeg2kDecodeStateCreate(nvjpeg2k_handle, &decode_state);
nvjpeg2kStreamCreate(&nvjpeg2k_stream);
2. Read the JPEG 2000 bitstream file into a host buffer. The library supports .jp2 files and JPEG 2000 codestreams.
3. Use the nvjpeg2kStreamParse API to parse the bitstream.
size_t length;
unsigned char *bitstream_buffer; // host or pinned memory
// read the bitstream and store it in bitstream_buffer;
// content of bitstream buffer should not be overwritten until the decoding is complete
nvjpeg2kStreamParse(nvjpeg2k_handle, bitstream_buffer, length, 0, 0, nvjpeg2k_stream);
4. Extract the image dimensions for each component and allocate output memory on the device, as shown in the snippet below. The decoded output is stored in nvjpeg2kImage_t. This data structure can hold outputs with 8-bit and 16-bit precision. The snippet below demonstrates the initialization of nvjpeg2kImage_t for an 8-bit, 3-channel image.
#define NUM_COMPONENTS 3
// extract image info
nvjpeg2kImageInfo_t image_info;
nvjpeg2kStreamGetImageInfo(nvjpeg2k_stream, &image_info);
// assuming the decoding of images with 8 bit precision, and 3 components
nvjpeg2kImageComponentInfo_t image_comp_info[NUM_COMPONENTS];
for (int c = 0; c < image_info.num_components; c++)
{
    nvjpeg2kStreamGetImageComponentInfo(nvjpeg2k_stream, &image_comp_info[c], c);
}
unsigned char *decode_output[NUM_COMPONENTS];
size_t pitch_in_bytes[NUM_COMPONENTS];
nvjpeg2kImage_t output_image;
for (int c = 0; c < NUM_COMPONENTS; c++)
{
    // the row width is given in bytes; for 8 bit samples it equals component_width
    cudaMallocPitch((void **)&decode_output[c], &pitch_in_bytes[c],
                    image_comp_info[c].component_width, image_comp_info[c].component_height);
}
output_image.pixel_data = (void **)decode_output;
output_image.pixel_type = NVJPEG2K_UINT8;
output_image.pitch_in_bytes = pitch_in_bytes;
output_image.num_components = NUM_COMPONENTS;
5. Call nvjpeg2kDecode(). One of the parameters of this function is a cudaStream_t. If a stream identifier is passed, the API uses it to issue the asynchronous CUDA calls.
cudaDeviceSynchronize() is required to complete the decoding process, since nvjpeg2kDecode is asynchronous with respect to the host.
nvjpeg2kDecode(nvjpeg2k_handle, decode_state, nvjpeg2k_stream, &output_image, 0); // 0 corresponds to cudaStream_t
cudaDeviceSynchronize();
6. Go to step 2 to decode another image. Once decoding is complete, release the nvjpeg2k library resources by calling the corresponding destroy APIs.
Type Declarations¶
API Return Status Codes¶
The return codes of the nvJPEG2000 APIs are listed below:
typedef enum
{
NVJPEG2K_STATUS_SUCCESS = 0,
NVJPEG2K_STATUS_NOT_INITIALIZED = 1,
NVJPEG2K_STATUS_INVALID_PARAMETER = 2,
NVJPEG2K_STATUS_BAD_JPEG = 3,
NVJPEG2K_STATUS_JPEG_NOT_SUPPORTED = 4,
NVJPEG2K_STATUS_ALLOCATOR_FAILURE = 5,
NVJPEG2K_STATUS_EXECUTION_FAILED = 6,
NVJPEG2K_STATUS_ARCH_MISMATCH = 7,
NVJPEG2K_STATUS_INTERNAL_ERROR = 8,
NVJPEG2K_STATUS_IMPLEMENTATION_NOT_SUPPORTED = 9,
} nvjpeg2kStatus_t;
Description of the Return Codes
| Return Code | Description |
|---|---|
| NVJPEG2K_STATUS_SUCCESS (0) | The API call has finished successfully. Note that many of the calls are asynchronous and some of the errors may be seen only after synchronization. |
| NVJPEG2K_STATUS_NOT_INITIALIZED (1) | The library handle was not initialized. |
| NVJPEG2K_STATUS_INVALID_PARAMETER (2) | A wrong parameter was passed, for example a null pointer as input data or an invalid enum value. |
| NVJPEG2K_STATUS_BAD_JPEG (3) | Cannot parse the JPEG2000 stream, likely due to a corruption that cannot be handled. |
| NVJPEG2K_STATUS_JPEG_NOT_SUPPORTED (4) | Attempting to decode a JPEG2000 stream that is not supported by the nvJPEG2000 library. |
| NVJPEG2K_STATUS_ALLOCATOR_FAILURE (5) | The user-provided allocator functions, for either memory allocation or for releasing the memory, returned a non-zero code. |
| NVJPEG2K_STATUS_EXECUTION_FAILED (6) | Error during the execution of the device tasks. |
| NVJPEG2K_STATUS_ARCH_MISMATCH (7) | The device capabilities are not enough for the set of input parameters provided. |
| NVJPEG2K_STATUS_INTERNAL_ERROR (8) | An unknown error occurred in the library. |
| NVJPEG2K_STATUS_IMPLEMENTATION_NOT_SUPPORTED (9) | The API is not supported by the backend. |
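Since every API call returns one of these codes, it can be convenient to turn the numeric value into its name when logging. The following is a minimal sketch under stated assumptions: status_name is a hypothetical helper that is not part of nvJPEG2000, and the enum is repeated locally only so that the snippet compiles on its own. Real code would include the nvJPEG2000 header instead.

```c
/* Local copy of the nvjpeg2kStatus_t enum listed above, repeated only so this
   sketch compiles without the library header. */
typedef enum
{
    NVJPEG2K_STATUS_SUCCESS = 0,
    NVJPEG2K_STATUS_NOT_INITIALIZED = 1,
    NVJPEG2K_STATUS_INVALID_PARAMETER = 2,
    NVJPEG2K_STATUS_BAD_JPEG = 3,
    NVJPEG2K_STATUS_JPEG_NOT_SUPPORTED = 4,
    NVJPEG2K_STATUS_ALLOCATOR_FAILURE = 5,
    NVJPEG2K_STATUS_EXECUTION_FAILED = 6,
    NVJPEG2K_STATUS_ARCH_MISMATCH = 7,
    NVJPEG2K_STATUS_INTERNAL_ERROR = 8,
    NVJPEG2K_STATUS_IMPLEMENTATION_NOT_SUPPORTED = 9
} nvjpeg2kStatus_t;

/* Hypothetical helper (not a library function): returns the enumerator name
   for a status code, or "UNKNOWN_STATUS" for values outside the enum. */
static const char *status_name(nvjpeg2kStatus_t status)
{
    switch (status)
    {
    case NVJPEG2K_STATUS_SUCCESS:            return "NVJPEG2K_STATUS_SUCCESS";
    case NVJPEG2K_STATUS_NOT_INITIALIZED:    return "NVJPEG2K_STATUS_NOT_INITIALIZED";
    case NVJPEG2K_STATUS_INVALID_PARAMETER:  return "NVJPEG2K_STATUS_INVALID_PARAMETER";
    case NVJPEG2K_STATUS_BAD_JPEG:           return "NVJPEG2K_STATUS_BAD_JPEG";
    case NVJPEG2K_STATUS_JPEG_NOT_SUPPORTED: return "NVJPEG2K_STATUS_JPEG_NOT_SUPPORTED";
    case NVJPEG2K_STATUS_ALLOCATOR_FAILURE:  return "NVJPEG2K_STATUS_ALLOCATOR_FAILURE";
    case NVJPEG2K_STATUS_EXECUTION_FAILED:   return "NVJPEG2K_STATUS_EXECUTION_FAILED";
    case NVJPEG2K_STATUS_ARCH_MISMATCH:      return "NVJPEG2K_STATUS_ARCH_MISMATCH";
    case NVJPEG2K_STATUS_INTERNAL_ERROR:     return "NVJPEG2K_STATUS_INTERNAL_ERROR";
    case NVJPEG2K_STATUS_IMPLEMENTATION_NOT_SUPPORTED:
        return "NVJPEG2K_STATUS_IMPLEMENTATION_NOT_SUPPORTED";
    }
    return "UNKNOWN_STATUS";
}
```

A caller could compare the return value of each API call against NVJPEG2K_STATUS_SUCCESS and print status_name(status) on failure. Because many calls are asynchronous, some errors may only surface after synchronization.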
Device Allocator Interface¶
typedef int (*nvjpeg2kDeviceMalloc)(void**, size_t);
typedef int (*nvjpeg2kDeviceFree)(void*);
typedef struct
{
nvjpeg2kDeviceMalloc device_malloc;
nvjpeg2kDeviceFree device_free;
} nvjpeg2kDeviceAllocator_t;
When the nvjpeg2kDeviceAllocator_t *allocator parameter in the nvjpeg2kCreate() function is set as a pointer to the above nvjpeg2kDeviceAllocator_t structure, then this structure is used for allocating and releasing the device memory. The function prototypes for the memory allocation and memory freeing functions are similar to the cudaMalloc() and cudaFree() functions. They should return 0 in case of success, and non-zero otherwise.
However, if the nvjpeg2kDeviceAllocator_t *allocator parameter in the nvjpeg2kCreate() function is set to NULL, then the default memory allocation functions cudaMalloc() and cudaFree() will be used. When the nvjpeg2kCreateSimple() function is used to create the library handle, the default device memory allocator is used.
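The allocator contract described above (0 on success, non-zero on failure) can be sketched as follows. This is an illustration only: counting_malloc, counting_free, make_counting_allocator and live_allocations are hypothetical names, the typedefs are repeated locally so the snippet compiles on its own, and malloc()/free() stand in for cudaMalloc()/cudaFree() so that the sketch runs without a GPU.

```c
#include <stddef.h>
#include <stdlib.h>

/* Typedefs repeated from the interface above so this sketch is standalone. */
typedef int (*nvjpeg2kDeviceMalloc)(void**, size_t);
typedef int (*nvjpeg2kDeviceFree)(void*);
typedef struct
{
    nvjpeg2kDeviceMalloc device_malloc;
    nvjpeg2kDeviceFree device_free;
} nvjpeg2kDeviceAllocator_t;

/* Hypothetical bookkeeping: number of allocations not yet released. */
static size_t live_allocations = 0;

/* Stand-in allocation function. Real code would call cudaMalloc(ptr, size)
   here and return its status; malloc() is used only to keep this runnable. */
static int counting_malloc(void **ptr, size_t size)
{
    if (ptr == NULL)
        return 1;               /* non-zero signals failure */
    *ptr = malloc(size);
    if (*ptr == NULL)
        return 1;
    live_allocations++;
    return 0;                   /* 0 signals success */
}

/* Stand-in release function. Real code would call cudaFree(ptr). */
static int counting_free(void *ptr)
{
    if (ptr != NULL)
    {
        free(ptr);
        live_allocations--;
    }
    return 0;
}

/* Bundles the two callbacks into the structure expected by nvjpeg2kCreate(). */
static nvjpeg2kDeviceAllocator_t make_counting_allocator(void)
{
    nvjpeg2kDeviceAllocator_t allocator = { counting_malloc, counting_free };
    return allocator;
}
```

A pointer to the resulting structure would be passed as the device_allocator argument of nvjpeg2kCreate(). Counting live allocations is one simple way to check that the library releases everything it requested once the handles are destroyed.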
Pinned Allocator Interface¶
typedef int (*nvjpeg2kPinnedMalloc)(void**, size_t, unsigned int flags);
typedef int (*nvjpeg2kPinnedFree)(void*);
typedef struct
{
nvjpeg2kPinnedMalloc pinned_malloc;
nvjpeg2kPinnedFree pinned_free;
} nvjpeg2kPinnedAllocator_t;
When the nvjpeg2kPinnedAllocator_t *allocator parameter in the nvjpeg2kCreate() function is set as a pointer to the above nvjpeg2kPinnedAllocator_t structure, then this structure will be used for allocating and releasing host pinned memory for copying data to/from the device. The function prototypes for the memory allocation and memory freeing functions are similar to the cudaHostAlloc() and cudaFreeHost() functions. They should return 0 in case of success, and non-zero otherwise.
However, if the nvjpeg2kPinnedAllocator_t *allocator parameter in the nvjpeg2kCreate() function is set to NULL, then the default memory allocation functions cudaHostAlloc() and cudaFreeHost() will be used. When the nvjpeg2kCreateSimple() function is used to create the library handle, the default host pinned memory allocator is used.
Library Backend¶
typedef enum
{
NVJPEG2K_BACKEND_DEFAULT = 0
} nvjpeg2kBackend_t;
The nvjpeg2kBackend_t enum lets the user select the internal implementation. Only a single implementation is currently supported; additional implementations may be added in the future.
Component Information¶
typedef struct
{
uint32_t component_width;
uint32_t component_height;
uint8_t precision;
uint8_t sgn;
} nvjpeg2kImageComponentInfo_t;
nvjpeg2kImageComponentInfo_t is used to retrieve component level information. This information can be used for allocating output buffers.
Image Information¶
typedef struct
{
uint32_t image_width;
uint32_t image_height;
uint32_t tile_width;
uint32_t tile_height;
uint32_t num_tiles_x;
uint32_t num_tiles_y;
uint32_t num_components;
} nvjpeg2kImageInfo_t;
nvjpeg2kImageInfo_t is used to retrieve image information which can be used to allocate output buffers.
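The fields of nvjpeg2kImageInfo_t are related to each other. Because the decoder requires image and tile start offsets of 0, the JPEG 2000 tile grid reduces to a ceiling division of the image size by the tile size, and tiles on the right and bottom edges may be smaller than the nominal tile size. The sketch below illustrates this relationship. The helper names are hypothetical and not part of the library, and the struct is repeated locally only so the snippet compiles on its own.

```c
#include <stdint.h>

/* Struct repeated from the declaration above so this sketch is standalone. */
typedef struct
{
    uint32_t image_width;
    uint32_t image_height;
    uint32_t tile_width;
    uint32_t tile_height;
    uint32_t num_tiles_x;
    uint32_t num_tiles_y;
    uint32_t num_components;
} nvjpeg2kImageInfo_t;

/* Ceiling division that cannot overflow; b must be non-zero. */
static uint32_t ceil_div_u32(uint32_t a, uint32_t b)
{
    return a / b + (a % b != 0 ? 1u : 0u);
}

/* Number of tile columns and rows implied by the image and tile sizes,
   assuming image and tile start offsets of 0. */
static uint32_t expected_tiles_x(const nvjpeg2kImageInfo_t *info)
{
    return ceil_div_u32(info->image_width, info->tile_width);
}

static uint32_t expected_tiles_y(const nvjpeg2kImageInfo_t *info)
{
    return ceil_div_u32(info->image_height, info->tile_height);
}

/* Width of tile column tile_x; the caller must pass tile_x < expected_tiles_x(). */
static uint32_t tile_width_at(const nvjpeg2kImageInfo_t *info, uint32_t tile_x)
{
    uint32_t remaining = info->image_width - tile_x * info->tile_width;
    return remaining < info->tile_width ? remaining : info->tile_width;
}

/* Height of tile row tile_y; the caller must pass tile_y < expected_tiles_y(). */
static uint32_t tile_height_at(const nvjpeg2kImageInfo_t *info, uint32_t tile_y)
{
    uint32_t remaining = info->image_height - tile_y * info->tile_height;
    return remaining < info->tile_height ? remaining : info->tile_height;
}
```

For example, a 1000 x 600 image with 256 x 256 tiles has a 4 x 3 tile grid, and the last tile column is 232 pixels wide.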
Image Type¶
typedef enum
{
NVJPEG2K_UINT8 = 0,
NVJPEG2K_UINT16 = 1
} nvjpeg2kImageType_t;
nvjpeg2kImageType_t describes the pixel data types supported by nvJPEG2000.
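One way to choose between the two types is to look at the precision field reported for each component. The sketch below assumes that components with up to 8 bits of precision are decoded to NVJPEG2K_UINT8 and anything wider to NVJPEG2K_UINT16; this mapping and the helper names are assumptions made for illustration, not library behavior. The enum is repeated locally so the snippet compiles on its own.

```c
#include <stddef.h>
#include <stdint.h>

/* Enum repeated from the declaration above so this sketch is standalone. */
typedef enum
{
    NVJPEG2K_UINT8 = 0,
    NVJPEG2K_UINT16 = 1
} nvjpeg2kImageType_t;

/* Hypothetical helper: narrowest output type that can hold `precision` bits.
   Assumes precision is in the supported range of 1 to 16 bits. */
static nvjpeg2kImageType_t pixel_type_for_precision(uint8_t precision)
{
    return (precision <= 8) ? NVJPEG2K_UINT8 : NVJPEG2K_UINT16;
}

/* Size in bytes of one sample of the given pixel type. */
static size_t bytes_per_sample(nvjpeg2kImageType_t type)
{
    return (type == NVJPEG2K_UINT16) ? sizeof(uint16_t) : sizeof(uint8_t);
}

/* Minimum number of bytes needed for one row of a component. This is the
   width-in-bytes value an allocation call such as cudaMallocPitch() expects. */
static size_t min_row_bytes(uint32_t component_width, nvjpeg2kImageType_t type)
{
    return (size_t)component_width * bytes_per_sample(type);
}
```

Note that the row width passed to a pitched allocation is expressed in bytes, so a 16-bit output needs twice the byte width of an 8-bit output with the same component_width.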
Image Data¶
typedef struct
{
void **pixel_data;
size_t *pitch_in_bytes;
nvjpeg2kImageType_t pixel_type;
uint32_t num_components;
} nvjpeg2kImage_t;
nvjpeg2kImage_t serves as an image container. It contains an array of void pointers, pixel_data, in which each pointer corresponds to a component in the nvjpeg2k bitstream. A corresponding array, pitch_in_bytes, defines the pitch of each component. pixel_type determines the data type of pixel_data. See Image Type for supported types.
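To illustrate how the pitch is used, the sketch below reads one sample from a planar, pitched image. It assumes the component planes are host-accessible, for example after the decoded planes have been copied back from the device with the same pitch; the pointers filled in for decoding refer to device memory and cannot be dereferenced on the host. sample_at is a hypothetical helper, and the type declarations are repeated locally so the snippet compiles on its own.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Declarations repeated from above so this sketch is standalone. */
typedef enum
{
    NVJPEG2K_UINT8 = 0,
    NVJPEG2K_UINT16 = 1
} nvjpeg2kImageType_t;

typedef struct
{
    void **pixel_data;
    size_t *pitch_in_bytes;
    nvjpeg2kImageType_t pixel_type;
    uint32_t num_components;
} nvjpeg2kImage_t;

/* Hypothetical helper: returns sample (x, y) of component c.
   Assumes pixel_data[c] points to host-accessible memory and that
   c, x and y are within range. */
static uint32_t sample_at(const nvjpeg2kImage_t *img, uint32_t c, uint32_t x, uint32_t y)
{
    /* Rows are pitch_in_bytes[c] bytes apart, which may exceed the row width. */
    const uint8_t *row = (const uint8_t *)img->pixel_data[c]
                       + (size_t)y * img->pitch_in_bytes[c];
    if (img->pixel_type == NVJPEG2K_UINT16)
    {
        uint16_t value;
        memcpy(&value, row + (size_t)x * sizeof(value), sizeof(value));
        return value;
    }
    return row[x];
}
```

The row address is always computed in bytes from the pitch, and only the final sample access depends on pixel_type.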
Library Handle¶
struct nvjpeg2kHandle;
typedef struct nvjpeg2kHandle* nvjpeg2kHandle_t;
This handle should be instantiated before using any of the APIs. It is thread-safe and can be used by multiple threads simultaneously.
Decoder State¶
struct nvjpeg2kDecodeState;
typedef struct nvjpeg2kDecodeState* nvjpeg2kDecodeState_t;
The nvjpeg2kDecodeState_t handle stores intermediate decode information. This handle can be reused when decoding multiple images. The user must ensure that a CUDA stream or device synchronization call is made between the decoding of two images.
Bitstream Handle¶
struct nvjpeg2kStream;
typedef struct nvjpeg2kStream* nvjpeg2kStream_t;
This handle is used for parsing the bitstream. Bitstream metadata can be extracted by using the APIs defined in Parser API Reference.
API Reference¶
Helper API Reference¶
nvjpeg2kGetCudartProperty()¶
Gets the numeric value for the major version, minor version, or the patch level of the CUDA toolkit that was used to build the nvJPEG2000 library.
Signature:
nvjpeg2kStatus_t NVJPEG2KAPI nvjpeg2kGetCudartProperty(libraryPropertyType type, int *value);
Parameters:
| Parameter | Input/Output | Memory | Description |
|---|---|---|---|
| libraryPropertyType type | Input | Host | One of the supported libraryPropertyType values, that is, MAJOR_VERSION, MINOR_VERSION or PATCH_LEVEL |
| int *value | Output | Host | The numeric value corresponding to the specific libraryPropertyType requested. |
nvjpeg2kGetProperty()¶
Gets the numeric value for the major or minor version, or the patch level, of the nvJPEG2000 library.
Signature:
nvjpeg2kStatus_t NVJPEG2KAPI nvjpeg2kGetProperty(libraryPropertyType type, int *value);
Parameters:
| Parameter | Input/Output | Memory | Description |
|---|---|---|---|
| libraryPropertyType type | Input | Host | One of the supported libraryPropertyType values, that is, MAJOR_VERSION, MINOR_VERSION or PATCH_LEVEL |
| int *value | Output | Host | The numeric value corresponding to the specific libraryPropertyType requested. |
nvjpeg2kCreateSimple()¶
Creates an instance of the library handle with default backend and memory allocators.
Signature:
nvjpeg2kStatus_t nvjpeg2kCreateSimple(nvjpeg2kHandle_t *handle);
Parameters:
| Parameter | Input/Output | Memory | Description |
|---|---|---|---|
| nvjpeg2kHandle_t *handle | Input/Output | Host | nvjpeg2k library handle |
Returns:
nvjpeg2kStatus_t - An error code as specified in API Return Status Codes
nvjpeg2kCreate()¶
Creates an instance of the library using the input arguments. The user can choose the backend implementation and provide custom allocators.
Signature:
nvjpeg2kStatus_t nvjpeg2kCreate(
nvjpeg2kBackend_t backend,
nvjpeg2kDeviceAllocator_t *device_allocator,
nvjpeg2kPinnedAllocator_t *pinned_allocator,
nvjpeg2kHandle_t *handle);
Parameters:
| Parameter | Input/Output | Memory | Description |
|---|---|---|---|
| nvjpeg2kBackend_t backend | Input | Host | Backend parameter. See Library Backend |
| nvjpeg2kDeviceAllocator_t *device_allocator | Input | Host | Device allocator. cudaMalloc and cudaFree are used if set to NULL. See Device Allocator Interface |
| nvjpeg2kPinnedAllocator_t *pinned_allocator | Input | Host | Pinned allocator. cudaHostAlloc and cudaFreeHost are used if set to NULL. See Pinned Allocator Interface |
| nvjpeg2kHandle_t *handle | Input/Output | Host | nvjpeg2k library handle |
Returns:
nvjpeg2kStatus_t - An error code as specified in API Return Status Codes
nvjpeg2kDestroy()¶
Releases the nvjpeg2k library handle.
Signature:
nvjpeg2kStatus_t nvjpeg2kDestroy(nvjpeg2kHandle_t handle);
Parameters:
| Parameter | Input/Output | Memory | Description |
|---|---|---|---|
| nvjpeg2kHandle_t handle | Input | Host | nvjpeg2k library handle |
Returns:
nvjpeg2kStatus_t - An error code as specified in API Return Status Codes
nvjpeg2kDecodeStateCreate()¶
Creates an instance of the Decode State.
Signature:
nvjpeg2kStatus_t nvjpeg2kDecodeStateCreate(
nvjpeg2kHandle_t handle,
nvjpeg2kDecodeState_t *decode_state);
Parameters:
| Parameter | Input/Output | Memory | Description |
|---|---|---|---|
| nvjpeg2kHandle_t handle | Input | Host | nvjpeg2k library handle |
| nvjpeg2kDecodeState_t *decode_state | Input/Output | Host | decode state handle |
Returns:
nvjpeg2kStatus_t - An error code as specified in API Return Status Codes
nvjpeg2kDecodeStateDestroy()¶
Releases the Decode State handle.
Signature:
nvjpeg2kStatus_t nvjpeg2kDecodeStateDestroy(nvjpeg2kDecodeState_t decode_state);
Parameters:
| Parameter | Input/Output | Memory | Description |
|---|---|---|---|
| nvjpeg2kDecodeState_t decode_state | Input | Host | decode state handle |
Returns:
nvjpeg2kStatus_t - An error code as specified in API Return Status Codes
nvjpeg2kStreamCreate()¶
Creates an instance of the bitstream handle.
Signature:
nvjpeg2kStatus_t nvjpeg2kStreamCreate(nvjpeg2kStream_t *stream_handle);
Parameters:
| Parameter | Input/Output | Memory | Description |
|---|---|---|---|
| nvjpeg2kStream_t *stream_handle | Input/Output | Host | nvjpeg2k bitstream handle |
Returns:
nvjpeg2kStatus_t - An error code as specified in API Return Status Codes
nvjpeg2kStreamDestroy()¶
Releases the bitstream handle.
Signature:
nvjpeg2kStatus_t nvjpeg2kStreamDestroy(nvjpeg2kStream_t stream_handle);
Parameters:
| Parameter | Input/Output | Memory | Description |
|---|---|---|---|
| nvjpeg2kStream_t stream_handle | Input | Host | nvjpeg2k bitstream handle |
Returns:
nvjpeg2kStatus_t - An error code as specified in API Return Status Codes
Parser API Reference¶
nvjpeg2kStreamParse()¶
This function is the first step in the decoding of a JPEG2000 bitstream. It accepts the bitstream buffer in host memory as input and parses the JPEG2000 header information. The parsed information is stored in the nvjpeg2kStream_t handle and can be retrieved by the APIs documented in this section.
Signature:
nvjpeg2kStatus_t nvjpeg2kStreamParse(nvjpeg2kHandle_t handle,
    const unsigned char *data,
    size_t length,
    int save_metadata,
    int save_stream,
    nvjpeg2kStream_t stream_handle);
Parameters:
| Parameter | Input/Output | Memory | Description |
|---|---|---|---|
| nvjpeg2kHandle_t handle | Input | Host | library handle |
| const unsigned char *data | Input | Host | bitstream buffer |
| size_t length | Input | Host | bitstream size in bytes |
| int save_metadata | Input | Host | Set to 0. Added for future use |
| int save_stream | Input | Host | Set to 0. Added for future use |
| nvjpeg2kStream_t stream_handle | Input | Host | bitstream handle |
Returns:
nvjpeg2kStatus_t - An error code as specified in API Return Status Codes
nvjpeg2kStreamGetImageInfo()¶
Retrieves the image information defined in nvjpeg2kImageInfo_t. This information is useful in allocating output buffers on device memory.
Signature:
nvjpeg2kStatus_t nvjpeg2kStreamGetImageInfo(nvjpeg2kStream_t stream_handle,
nvjpeg2kImageInfo_t* image_info);
Parameters:
| Parameter | Input/Output | Memory | Description |
|---|---|---|---|
| nvjpeg2kStream_t stream_handle | Input | Host | bitstream handle |
| nvjpeg2kImageInfo_t* image_info | Input/Output | Host | Pointer to nvjpeg2kImageInfo_t |
Returns:
nvjpeg2kStatus_t - An error code as specified in API Return Status Codes
nvjpeg2kStreamGetImageComponentInfo()¶
Retrieves the component-level information defined in nvjpeg2kImageComponentInfo_t. This information can be used in allocating output buffers on device memory. Component-level information is useful when the dimensions vary across components.
Signature:
nvjpeg2kStatus_t nvjpeg2kStreamGetImageComponentInfo(nvjpeg2kStream_t stream_handle,
nvjpeg2kImageComponentInfo_t* component_info,
uint32_t component_id);
Parameters:
| Parameter | Input/Output | Memory | Description |
|---|---|---|---|
| nvjpeg2kStream_t stream_handle | Input | Host | bitstream handle |
| nvjpeg2kImageComponentInfo_t* component_info | Input/Output | Host | Pointer to nvjpeg2kImageComponentInfo_t |
| uint32_t component_id | Input | Host | Component index |
Returns:
nvjpeg2kStatus_t - An error code as specified in API Return Status Codes
Decode API Reference¶
nvjpeg2kDecode()¶
Decodes a single image and writes it to the output device buffers. The function is asynchronous with respect to the host. All GPU tasks are submitted to the provided stream. Before calling this function, the user must parse the bitstream using nvjpeg2kStreamParse() so that the bitstream information is stored in jpeg2k_stream.
Signature:
nvjpeg2kStatus_t nvjpeg2kDecode(nvjpeg2kHandle_t handle,
nvjpeg2kDecodeState_t decode_state,
nvjpeg2kStream_t jpeg2k_stream,
nvjpeg2kImage_t* decode_output,
cudaStream_t stream);
Parameters:
| Parameter | Input/Output | Memory | Description |
|---|---|---|---|
| nvjpeg2kHandle_t handle | Input | Host | library handle |
| nvjpeg2kDecodeState_t decode_state | Input | Host | decode state handle |
| nvjpeg2kStream_t jpeg2k_stream | Input | Host | nvjpeg2k bitstream handle |
| nvjpeg2kImage_t* decode_output | Input/Output | Host | Decode output struct. The struct should be in host memory. The image component pointers should point to device memory. See Image Data |
| cudaStream_t stream | Input | Host | CUDA stream to which all GPU tasks are submitted |
Returns:
nvjpeg2kStatus_t - An error code as specified in API Return Status Codes