Quick Start Guide

This section explains how to use the decoder and encoder APIs in a few quick steps. The next section covers the API details.

Note

Link to nvJPEG2000 Samples: NVIDIA/CUDALibrarySamples

JPEG2000 Decode

The library expects the bitstream to be in host memory and writes the decoded output to device memory.

1. Initialize the library handles listed below:

nvjpeg2kStream_t - is used to parse the bitstream and store the bitstream metadata
nvjpeg2kHandle_t - is the nvjpeg2k library handle
nvjpeg2kDecodeState_t - is used to store the work buffers required for decode

nvjpeg2kHandle_t nvjpeg2k_handle;
nvjpeg2kStream_t nvjpeg2k_stream;
nvjpeg2kDecodeState_t decode_state;

nvjpeg2kCreateSimple(&nvjpeg2k_handle);
nvjpeg2kDecodeStateCreate(&decode_state);
nvjpeg2kStreamCreate(&nvjpeg2k_stream);

2. Read the JPEG2000 bitstream file from disk and store it in a host buffer. The library supports .jp2 files and JPEG2000 codestreams.
3. Use the nvjpeg2kStreamParse API to parse the bitstream.

size_t length;                    // size of the bitstream in bytes
unsigned char *bitstream_buffer;  // host or pinned memory
// read the bitstream from disk, store it in bitstream_buffer, and set length to its size

// the contents of bitstream_buffer must not be overwritten until decoding is complete
nvjpeg2kStatus_t status = nvjpeg2kStreamParse(nvjpeg2k_handle, bitstream_buffer, length, 0, 0, nvjpeg2k_stream);
// make sure that nvjpeg2kStreamParse returns NVJPEG2K_STATUS_SUCCESS before proceeding to the next step
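Reading the file into the host buffer can be sketched as follows. This is a minimal C++ helper and not part of nvJPEG2000; the name read_bitstream_file is hypothetical. It returns the file contents as a byte vector, whose data() and size() can serve as bitstream_buffer and length in the call to nvjpeg2kStreamParse.

```cpp
#include <cassert>
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not an nvJPEG2000 API): loads a whole file into host memory.
std::vector<unsigned char> read_bitstream_file(const std::string &path)
{
    assert(!path.empty());

    // Binary mode keeps the bytes of the codestream unchanged on every platform.
    std::ifstream file(path, std::ios::binary);
    if (!file)
        throw std::runtime_error("cannot open " + path);

    // Copy every byte from the stream into the vector.
    std::vector<unsigned char> buffer((std::istreambuf_iterator<char>(file)),
                                      std::istreambuf_iterator<char>());
    return buffer;
}
```

The vector must stay alive and unmodified until decoding is complete, for the same reason as the raw buffer above.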

4. Extract the image dimensions for each component and allocate output memory on the device, as shown in the snippet below. The decoded output is stored in nvjpeg2kImage_t, which can hold outputs with 8-bit and 16-bit precision. The snippet initializes nvjpeg2kImage_t for an 8-bit, 3-channel image.

#define NUM_COMPONENTS 3
// extract image info
nvjpeg2kImageInfo_t image_info;
nvjpeg2kStreamGetImageInfo(nvjpeg2k_stream, &image_info);

// assuming the decoding of images with 8 bit precision, and 3 components

nvjpeg2kImageComponentInfo_t image_comp_info[NUM_COMPONENTS];

for (int c = 0; c < NUM_COMPONENTS; c++)  // image_comp_info holds NUM_COMPONENTS entries
{
    nvjpeg2kStreamGetImageComponentInfo(nvjpeg2k_stream, &image_comp_info[c], c);
}

unsigned char *decode_output[NUM_COMPONENTS];
size_t pitch_in_bytes[NUM_COMPONENTS];

nvjpeg2kImage_t output_image;
for (int c = 0; c < NUM_COMPONENTS; c++)
{
    cudaMallocPitch(&decode_output[c], &pitch_in_bytes[c], image_comp_info[c].component_width, image_comp_info[c].component_height);
}

output_image.pixel_data = (void **)decode_output;
output_image.pixel_type = NVJPEG2K_UINT8;
output_image.pitch_in_bytes = pitch_in_bytes;
output_image.num_components = NUM_COMPONENTS;
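The snippet above assumes 8-bit components. The width argument of cudaMallocPitch is given in bytes, so for higher precisions it depends on how many bytes each sample occupies. The arithmetic can be sketched as below, assuming that precisions up to 8 bits are stored in one byte and precisions up to 16 bits in two bytes; both helper names are hypothetical and not part of nvJPEG2000.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: bytes needed to store one sample of the given precision.
// Assumes 1..8 bits fit in one byte and 9..16 bits in two bytes.
inline std::size_t bytes_per_sample(std::uint8_t precision)
{
    assert(precision >= 1 && precision <= 16);
    return precision <= 8 ? 1 : 2;
}

// Hypothetical helper: width of one component row in bytes, i.e. the value to
// pass as the width argument of cudaMallocPitch for that component.
inline std::size_t row_width_in_bytes(std::uint32_t component_width, std::uint8_t precision)
{
    return static_cast<std::size_t>(component_width) * bytes_per_sample(precision);
}
```

For a 16-bit image the output buffers would then be declared with a 16-bit element type, and pixel_type set to the matching 16-bit value instead of NVJPEG2K_UINT8.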

5. Call nvjpeg2kDecode(). One of the parameters of this function is a cudaStream_t. If a stream identifier is passed, the API uses it to issue the asynchronous CUDA calls. Because nvjpeg2kDecode is asynchronous with respect to the host, call cudaDeviceSynchronize() to wait for decoding to complete before using the output.

status = nvjpeg2kDecode(nvjpeg2k_handle, decode_state, nvjpeg2k_stream, &output_image, 0); // the last argument, 0, is the cudaStream_t (default stream)
cudaDeviceSynchronize();

6. Go to step 2 to decode another image. Once all images are decoded, release the nvJPEG2000 library resources by calling the corresponding destroy APIs.

JPEG2000 Encode

The library expects the input image to be in device memory in planar format and writes the compressed output to host memory.

1. Initialize the library handles listed below:

nvjpeg2kEncoder_t - is the nvJPEG2000 encoder handle
nvjpeg2kEncodeState_t - is used to store the encoder work buffers and intermediate results
nvjpeg2kEncodeParams_t - stores various parameters that control the compressed output

nvjpeg2kEncoder_t enc_handle;
nvjpeg2kEncodeState_t enc_state;
nvjpeg2kEncodeParams_t enc_params;

nvjpeg2kEncoderCreateSimple(&enc_handle);
nvjpeg2kEncodeStateCreate(&enc_state);
nvjpeg2kEncodeParamsCreate(&enc_params);

2. Copy the input image to device memory in planar format. Store the image buffer pointers in nvjpeg2kImage_t. The snippet below initializes nvjpeg2kImage_t and nvjpeg2kImageComponentInfo_t for an 8-bit, 3-channel RGB image.

#define NUM_COMPONENTS 3
unsigned char *pixel_data[NUM_COMPONENTS];
size_t pitch_in_bytes[NUM_COMPONENTS];

nvjpeg2kImageComponentInfo_t image_comp_info[NUM_COMPONENTS];

uint32_t image_width;   // assign the image width in pixels
uint32_t image_height;  // assign the image height in pixels

for (int c = 0; c < NUM_COMPONENTS; c++)
{
    image_comp_info[c].component_width  = image_width;
    image_comp_info[c].component_height = image_height;
    image_comp_info[c].precision        = 8;
    image_comp_info[c].sgn              = 0;
}

nvjpeg2kImage_t input_image;
for (int c = 0; c < NUM_COMPONENTS; c++)
{
    cudaMallocPitch(&pixel_data[c], &pitch_in_bytes[c], image_comp_info[c].component_width, image_comp_info[c].component_height);
    // cudaMallocPitch lets CUDA determine the pitch; cudaMalloc can be used instead if required
}

// Copy the image to the device buffers.

input_image.pixel_data = (void **)pixel_data;
input_image.pixel_type = NVJPEG2K_UINT8;
input_image.pitch_in_bytes = pitch_in_bytes;
input_image.num_components = NUM_COMPONENTS;
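If the source image is stored interleaved on the host (for example RGBRGB...), it has to be split into one plane per component before it is copied to the device buffers. A minimal host-side sketch, assuming tightly packed 8-bit interleaved input; interleaved_to_planar is a hypothetical helper and not an nvJPEG2000 API.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: splits an interleaved image (c0 c1 c2 c0 c1 c2 ...)
// into one tightly packed plane per component.
std::vector<std::vector<unsigned char>> interleaved_to_planar(
    const std::vector<unsigned char> &interleaved,
    std::size_t width, std::size_t height, std::size_t num_components)
{
    const std::size_t num_pixels = width * height;
    assert(interleaved.size() == num_pixels * num_components);

    std::vector<std::vector<unsigned char>> planes(
        num_components, std::vector<unsigned char>(num_pixels));

    for (std::size_t i = 0; i < num_pixels; ++i)
        for (std::size_t c = 0; c < num_components; ++c)
            planes[c][i] = interleaved[i * num_components + c];

    return planes;
}
```

Each plane can then be copied to pixel_data[c] with cudaMemcpy2D, using the image width in bytes as the source pitch and pitch_in_bytes[c] as the destination pitch.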

3. Populate the nvjpeg2kEncodeConfig_t structure and call nvjpeg2kEncodeParamsSetEncodeConfig. The snippet below shows the settings that generate a JPEG2000 bitstream using the reversible wavelet transform with a 64x64 code block size.

Note

The valid values for each field in nvjpeg2kEncodeConfig_t are documented here.

nvjpeg2kEncodeConfig_t enc_config;
memset(&enc_config, 0, sizeof(enc_config));
enc_config.stream_type      =  NVJPEG2K_STREAM_JP2; // the bitstream will be in JP2 container format
enc_config.color_space      =  NVJPEG2K_COLORSPACE_SRGB; // input image is in RGB format
enc_config.image_width      =  image_width;
enc_config.image_height     =  image_height;
enc_config.num_components   =  NUM_COMPONENTS;
enc_config.image_comp_info  =  image_comp_info;
enc_config.code_block_w     =  64;
enc_config.code_block_h     =  64;
enc_config.irreversible     =  0; // 0 selects the reversible wavelet transform
enc_config.mct_mode         =  1;
enc_config.prog_order       =  NVJPEG2K_LRCP;
enc_config.num_resolutions  =  6;

nvjpeg2kStatus_t status = nvjpeg2kEncodeParamsSetEncodeConfig(enc_params, &enc_config);

Note

Check that every nvJPEG2000 API call returns NVJPEG2K_STATUS_SUCCESS. The results may not be valid otherwise.
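One way to apply this check consistently is to pass every return value through a small helper that stops on the first failure. The sketch below is generic over the status type so that it compiles on its own; check_status is a hypothetical helper, and with nvJPEG2000 the expected value would be NVJPEG2K_STATUS_SUCCESS.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical helper: throws std::runtime_error if status differs from success.
// The numeric status value is included in the message to help with debugging.
template <typename Status>
void check_status(Status status, Status success, const char *what)
{
    assert(what != nullptr);
    if (status != success)
        throw std::runtime_error(std::string(what) + " failed with status " +
                                 std::to_string(static_cast<int>(status)));
}
```

Under these assumptions, the call above could be written as check_status(nvjpeg2kEncodeParamsSetEncodeConfig(enc_params, &enc_config), NVJPEG2K_STATUS_SUCCESS, "nvjpeg2kEncodeParamsSetEncodeConfig").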

4. For lossy encoding, set the target PSNR in decibels (dB) as shown below. This call is not required for lossless encoding.

double target_psnr = 50;
status = nvjpeg2kEncodeParamsSetQuality(enc_params, target_psnr);

5. Call nvjpeg2kEncode. One of the parameters of this function is a cudaStream_t. If a stream identifier is passed, the API uses it to issue the asynchronous CUDA calls.

status = nvjpeg2kEncode(enc_handle, enc_state, enc_params, &input_image, NULL);

6. Retrieve the compressed JPEG2000 bitstream to host memory, as shown in the snippet below. If a stream identifier is passed, the API uses it to issue the asynchronous CUDA calls. Because these APIs are asynchronous with respect to the host, call cudaDeviceSynchronize() to complete the encoding process.

// set the compressed_data buffer to NULL to retrieve the bitstream size
size_t compressed_size;
status = nvjpeg2kEncodeRetrieveBitstream(enc_handle, enc_state, NULL, &compressed_size, NULL); // the last argument is the cudaStream_t

// allocate output buffer
unsigned char *compressed_data = new unsigned char[compressed_size];
status = nvjpeg2kEncodeRetrieveBitstream(enc_handle, enc_state, compressed_data, &compressed_size,
    NULL); // use the same cudaStream_t that was passed to nvjpeg2kEncode
cudaDeviceSynchronize();
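Once the synchronization has completed, the bitstream in host memory can be saved to disk. A minimal sketch using standard C++ file streams; write_bitstream_file is a hypothetical helper and not part of nvJPEG2000.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <stdexcept>
#include <string>

// Hypothetical helper: writes size bytes starting at data to the file at path,
// replacing any existing file.
void write_bitstream_file(const std::string &path, const unsigned char *data, std::size_t size)
{
    assert(data != nullptr || size == 0);

    std::ofstream file(path, std::ios::binary | std::ios::trunc);
    if (!file)
        throw std::runtime_error("cannot open " + path);

    file.write(reinterpret_cast<const char *>(data), static_cast<std::streamsize>(size));
    if (!file)
        throw std::runtime_error("cannot write " + path);
}
```

With the variables from the snippet above, the call would be write_bitstream_file(path, compressed_data, compressed_size), followed by delete[] compressed_data once the buffer is no longer needed.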

7. Go to step 2 to encode another image. Once all images are encoded, release the nvJPEG2000 library resources by calling the corresponding destroy APIs.