Quick Start Guide#
This section explains how to use the decoder and encoder APIs in a few quick steps. API details are covered in the next section.
Note
Link to nvJPEG2000 Samples: NVIDIA/CUDALibrarySamples
JPEG2000 Decode#
The library expects the bitstream to be in host memory and writes the decoded output to device memory.
1. Initialize the library handles listed below:
- nvjpeg2kStream_t is used to parse the bitstream and store the bitstream metadata
- nvjpeg2kHandle_t is the nvjpeg2k library handle
- nvjpeg2kDecodeState_t is used to store the work buffers required for decode
nvjpeg2kHandle_t nvjpeg2k_handle;
nvjpeg2kStream_t nvjpeg2k_stream;
nvjpeg2kDecodeState_t decode_state;
nvjpeg2kCreateSimple(&nvjpeg2k_handle);
nvjpeg2kDecodeStateCreate(&decode_state);
nvjpeg2kStreamCreate(&nvjpeg2k_stream);
2. Read the JPEG2000 bitstream file from disk and store it in the host buffer. The library supports .jp2 files and JPEG2000 codestreams.
3. Use the nvjpeg2kStreamParse API to parse the bitstream.
size_t length;
unsigned char *bitstream_buffer; // host or pinned memory
// read the bitstream from disk, store it in bitstream_buffer, and set length to its size in bytes;
// content of bitstream buffer should not be overwritten until the decoding is complete
nvjpeg2kStatus_t status = nvjpeg2kStreamParse(nvjpeg2k_handle, bitstream_buffer, length, 0, 0, nvjpeg2k_stream);
// make sure that nvjpeg2kStreamParse returns NVJPEG2K_STATUS_SUCCESS before proceeding to the next step
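Steps 2 and 3 leave the file reading to the application. The sketch below is one way to load a file into a host buffer and to check whether it looks like a .jp2 container or a raw JPEG2000 codestream before it is handed to nvjpeg2kStreamParse. The names read_bitstream_file, classify_bitstream, and BitstreamKind are hypothetical helpers, not part of the nvJPEG2000 API. The check relies only on the standard JP2 signature box and on the SOC and SIZ markers that open a codestream.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <vector>

// Hypothetical helper type; not part of the nvJPEG2000 API.
enum class BitstreamKind { Jp2Container, RawCodestream, Unknown };

// Read an entire file into a host buffer. Returns an empty vector on failure.
std::vector<unsigned char> read_bitstream_file(const char* path) {
    assert(path != nullptr);
    std::vector<unsigned char> data;
    std::FILE* file = std::fopen(path, "rb");
    if (file == nullptr) {
        return data;
    }
    unsigned char chunk[4096];
    std::size_t n = 0;
    while ((n = std::fread(chunk, 1, sizeof(chunk), file)) > 0) {
        data.insert(data.end(), chunk, chunk + n);
    }
    std::fclose(file);
    return data;
}

// Inspect the leading bytes to tell a JP2 container from a raw codestream.
BitstreamKind classify_bitstream(const std::vector<unsigned char>& data) {
    // JP2 signature box: length 12, type "jP  ", payload 0D 0A 87 0A.
    static const unsigned char kJp2Signature[12] = {
        0x00, 0x00, 0x00, 0x0C, 0x6A, 0x50, 0x20, 0x20, 0x0D, 0x0A, 0x87, 0x0A};
    if (data.size() >= sizeof(kJp2Signature) &&
        std::memcmp(data.data(), kJp2Signature, sizeof(kJp2Signature)) == 0) {
        return BitstreamKind::Jp2Container;
    }
    // A raw codestream starts with the SOC marker (FF 4F) followed by SIZ (FF 51).
    if (data.size() >= 4 && data[0] == 0xFF && data[1] == 0x4F &&
        data[2] == 0xFF && data[3] == 0x51) {
        return BitstreamKind::RawCodestream;
    }
    return BitstreamKind::Unknown;
}
```

With such a buffer, data.data() and data.size() would play the roles of bitstream_buffer and length in the nvjpeg2kStreamParse call above. The vector must stay alive and unmodified until decoding completes.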
4. Extract the image dimensions for each component and allocate output memory on the device as shown in the snippet below. The decoded output is stored in nvjpeg2kImage_t. This data structure can hold outputs with 8-bit and 16-bit precision. The snippet below demonstrates initialization of nvjpeg2kImage_t for an 8-bit, 3-channel image.
#define NUM_COMPONENTS 3
// extract image info
nvjpeg2kImageInfo_t image_info;
nvjpeg2kStreamGetImageInfo(nvjpeg2k_stream, &image_info);
// assuming the decoding of images with 8 bit precision, and 3 components
nvjpeg2kImageComponentInfo_t image_comp_info[NUM_COMPONENTS];
for (int c = 0; c < image_info.num_components; c++)
{
nvjpeg2kStreamGetImageComponentInfo(nvjpeg2k_stream, &image_comp_info[c], c);
}
unsigned char *decode_output[NUM_COMPONENTS];
size_t pitch_in_bytes[NUM_COMPONENTS];
nvjpeg2kImage_t output_image;
for (int c = 0; c < NUM_COMPONENTS; c++)
{
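// the width argument of cudaMallocPitch is in bytes; for 8 bit samples it equals the component width
cudaMallocPitch((void **)&decode_output[c], &pitch_in_bytes[c], image_comp_info[c].component_width, image_comp_info[c].component_height);
}
output_image.pixel_data = (void **)decode_output;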
output_image.pixel_type = NVJPEG2K_UINT8;
output_image.pitch_in_bytes = pitch_in_bytes;
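The snippet above assumes 8-bit samples, where one sample occupies one byte. Because the width argument of cudaMallocPitch is given in bytes, components with higher precision need a wider row. The sketch below shows that arithmetic; bytes_per_sample and row_width_in_bytes are hypothetical helper names, and the code assumes that precisions of 9 to 16 bits are stored in 2-byte samples.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Bytes needed to store one sample of the given precision (in bits).
// Assumption: precisions 1..8 use one byte, 9..16 use two bytes.
std::size_t bytes_per_sample(unsigned precision) {
    assert(precision >= 1 && precision <= 16);
    return precision <= 8 ? 1 : 2;
}

// Row width in bytes for one component, suitable as the width
// argument of cudaMallocPitch.
std::size_t row_width_in_bytes(std::uint32_t component_width, unsigned precision) {
    return static_cast<std::size_t>(component_width) * bytes_per_sample(precision);
}
```

For a 16-bit image, the result of row_width_in_bytes would replace the plain component width in the allocation call, and the pixel type would be set to a 16-bit type rather than NVJPEG2K_UINT8.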
5. Call nvjpeg2kDecode(). One of the parameters of this function is cudaStream_t. If a stream identifier is passed, the API uses it to issue the asynchronous CUDA calls. cudaDeviceSynchronize() is required to complete the decoding process, since nvjpeg2kDecode is asynchronous with respect to the host.
status = nvjpeg2kDecode(nvjpeg2k_handle, decode_state, nvjpeg2k_stream, &output_image, 0); // 0 corresponds to cudaStream_t
cudaDeviceSynchronize();
6. Go to step 2 to decode another image. Once all images are decoded, release the nvJPEG2000 library resources by calling the corresponding destroy APIs.
JPEG2000 Encode#
The library expects the input image to be in device memory in planar format and writes the compressed output to host memory.
1. Initialize the library handles listed below:
- nvjpeg2kEncoder_t is the nvJPEG2000 encoder handle
- nvjpeg2kEncodeState_t is used to store the encoder work buffers and intermediate results
- nvjpeg2kEncodeParams_t stores various parameters that control the compressed output
nvjpeg2kEncoder_t enc_handle;
nvjpeg2kEncodeState_t enc_state;
nvjpeg2kEncodeParams_t enc_params;
nvjpeg2kEncoderCreateSimple(&enc_handle);
nvjpeg2kEncodeStateCreate(&enc_state);
nvjpeg2kEncodeParamsCreate(&enc_params);
2. Copy the input image to device memory in planar format. Store the image buffer pointers in nvjpeg2kImage_t.
The snippet below demonstrates initialization of nvjpeg2kImage_t and nvjpeg2kImageComponentInfo_t for an 8-bit, 3-channel RGB image.
#define NUM_COMPONENTS 3
unsigned char *pixel_data[NUM_COMPONENTS];
size_t pitch_in_bytes[NUM_COMPONENTS];
nvjpeg2kImageComponentInfo_t image_comp_info[NUM_COMPONENTS];
uint32_t image_width = // assign image width
uint32_t image_height = // assign image height
for (int c = 0; c < NUM_COMPONENTS; c++)
{
image_comp_info[c].component_width = image_width;
image_comp_info[c].component_height = image_height;
image_comp_info[c].precision = 8;
image_comp_info[c].sgn = 0;
}
nvjpeg2kImage_t input_image;
for (int c = 0; c < NUM_COMPONENTS; c++)
{
cudaMallocPitch((void **)&pixel_data[c], &pitch_in_bytes[c], image_comp_info[c].component_width, image_comp_info[c].component_height);
// cudaMallocPitch is used to let CUDA determine the pitch. cudaMalloc can be used if required.
}
// Copy the image to the device buffers.
input_image.pixel_data = (void **)pixel_data;
input_image.pixel_type = NVJPEG2K_UINT8;
input_image.pitch_in_bytes = pitch_in_bytes;
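Many image loaders return interleaved pixels (RGBRGB...), while the encoder expects one plane per component. The sketch below shows a host-side split into planes that could be done before copying to the device; interleaved_to_planar is a hypothetical helper name, and the code assumes 8-bit samples and tightly packed rows.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Split an interleaved 8 bit image into one tightly packed plane per component.
std::vector<std::vector<unsigned char>> interleaved_to_planar(
    const std::vector<unsigned char>& interleaved,
    std::uint32_t width, std::uint32_t height, std::uint32_t num_components) {
    const std::size_t pixels = static_cast<std::size_t>(width) * height;
    assert(interleaved.size() == pixels * num_components);
    std::vector<std::vector<unsigned char>> planes(
        num_components, std::vector<unsigned char>(pixels));
    for (std::size_t p = 0; p < pixels; ++p) {
        for (std::uint32_t c = 0; c < num_components; ++c) {
            // Sample c of pixel p moves to position p of plane c.
            planes[c][p] = interleaved[p * num_components + c];
        }
    }
    return planes;
}
```

Each plane could then be copied into the matching pitched device buffer with cudaMemcpy2D, using the image width in bytes as the source pitch and pitch_in_bytes[c] as the destination pitch.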
3. Populate the nvjpeg2kEncodeConfig_t structure and call nvjpeg2kEncodeParamsSetEncodeConfig. The code snippet below documents the settings to generate a JPEG2000 bitstream using the reversible wavelet transform with a 64x64 code block size.
Note
The valid values for each field in nvjpeg2kEncodeConfig_t are documented here.
nvjpeg2kEncodeConfig_t enc_config;
memset(&enc_config, 0, sizeof(enc_config));
enc_config.stream_type = NVJPEG2K_STREAM_JP2; // the bitstream will be in JP2 container format
enc_config.color_space = NVJPEG2K_COLORSPACE_SRGB; // input image is in RGB format
enc_config.image_width = image_width;
enc_config.image_height = image_height;
enc_config.num_components = NUM_COMPONENTS;
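enc_config.image_comp_info = image_comp_info; // the array decays to a pointer to its first element
enc_config.code_block_w = 64;
enc_config.code_block_h = 64;
enc_config.irreversible = 0; // 0 selects the reversible wavelet transform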
enc_config.mct_mode = 1;
enc_config.prog_order = NVJPEG2K_LRCP;
enc_config.num_resolutions = 6;
nvjpeg2kStatus_t status = nvjpeg2kEncodeParamsSetEncodeConfig(enc_params, &enc_config);
Note
All nvJPEG2000 APIs should return NVJPEG2K_STATUS_SUCCESS. The results may not be valid otherwise.
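To get a feel for what num_resolutions = 6 means for a given image, the sketch below computes the dimensions of each resolution level. It assumes that num_resolutions counts resolution levels in the usual JPEG2000 sense (the number of wavelet decomposition levels plus one), that the image origin is at (0, 0), and that components are not subsampled. ResolutionSize and resolution_sizes are hypothetical names, not part of the nvJPEG2000 API.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper type; not part of the nvJPEG2000 API.
struct ResolutionSize {
    std::uint32_t width;
    std::uint32_t height;
};

// Ceiling of value / 2^shift, computed in 64 bits to avoid overflow.
static std::uint32_t ceil_div_pow2(std::uint32_t value, std::uint32_t shift) {
    const std::uint64_t divisor = std::uint64_t(1) << shift;
    return static_cast<std::uint32_t>((value + divisor - 1) >> shift);
}

// Dimensions of every resolution level, from the smallest (index 0)
// to the full-size image (index num_resolutions - 1).
std::vector<ResolutionSize> resolution_sizes(std::uint32_t image_width,
                                             std::uint32_t image_height,
                                             std::uint32_t num_resolutions) {
    assert(num_resolutions >= 1 && num_resolutions <= 32);
    std::vector<ResolutionSize> sizes;
    for (std::uint32_t r = 0; r < num_resolutions; ++r) {
        // Each level below full size halves both dimensions, rounding up.
        const std::uint32_t shift = num_resolutions - 1 - r;
        sizes.push_back({ceil_div_pow2(image_width, shift),
                         ceil_div_pow2(image_height, shift)});
    }
    return sizes;
}
```

Under these assumptions, a 1920x1080 image with 6 resolutions has a 60x34 lowest resolution level, which can help when judging whether the chosen number of resolutions is reasonable for small images.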
4. For lossy encode, set the target PSNR in decibels (dB) as shown below. Calling this API is not required for lossless encode.
double target_psnr = 50;
status = nvjpeg2kEncodeParamsSetQuality(enc_params, target_psnr);
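As a reference for what a PSNR target expresses, the sketch below computes the PSNR between an original 8-bit plane and its decoded counterpart using the usual definition, 10 * log10(255^2 / MSE). This is an illustration only: psnr_db is a hypothetical helper, and how the library measures quality internally is not described here.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <vector>

// PSNR in dB between two 8 bit sample buffers of equal, nonzero size.
// Returns +infinity when the buffers are identical (lossless round trip).
double psnr_db(const std::vector<unsigned char>& original,
               const std::vector<unsigned char>& decoded) {
    assert(!original.empty() && original.size() == decoded.size());
    double sum_squared_error = 0.0;
    for (std::size_t i = 0; i < original.size(); ++i) {
        const double diff = static_cast<double>(original[i]) -
                            static_cast<double>(decoded[i]);
        sum_squared_error += diff * diff;
    }
    const double mse = sum_squared_error / static_cast<double>(original.size());
    if (mse == 0.0) {
        return std::numeric_limits<double>::infinity();
    }
    const double peak = 255.0;  // maximum value of an 8 bit sample
    return 10.0 * std::log10(peak * peak / mse);
}
```

A mean squared error of 1 corresponds to roughly 48.13 dB for 8-bit data, so a target of 50 dB asks for an average error slightly below one gray level.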
5. Call nvjpeg2kEncode. One of the parameters of this function is cudaStream_t. If a stream identifier is passed, the API uses it to issue the asynchronous CUDA calls.
status = nvjpeg2kEncode(enc_handle, enc_state, enc_params, &input_image, NULL); // NULL corresponds to cudaStream_t
6. Retrieve the compressed JPEG2000 bitstream to host memory as shown in the code snippet below. If a stream identifier is passed, the API uses it to issue the asynchronous CUDA calls. cudaDeviceSynchronize() is required to complete the encoding process, since the APIs are asynchronous with respect to the host.
// set the compressed_data buffer to NULL to retrieve the bitstream size
size_t compressed_size;
status = nvjpeg2kEncodeRetrieveBitstream(enc_handle, enc_state, NULL, &compressed_size, NULL);
// allocate output buffer
unsigned char *compressed_data = new unsigned char [compressed_size];
// the last argument is the cudaStream_t; NULL matches the stream passed to nvjpeg2kEncode
status = nvjpeg2kEncodeRetrieveBitstream(enc_handle, enc_state, compressed_data, &compressed_size, NULL);
cudaDeviceSynchronize();
7. Go to step 2 to encode another image. Once all images are encoded, release the nvJPEG2000 library resources by calling the corresponding destroy APIs.