Quick Start Guide¶

This section will explain how to use decoder API in a few quick steps. The API details will be covered in the next section.

JPEG2000 Decode¶

The library expects the bitstream to be on host memory and the decoded output will be written to device memory.

Note

nvJPEG2000 Decode sample can be found under this repository: https://github.com/NVIDIA/CUDALibrarySamples/tree/master/nvJPEG2000

The first step is to initialize the handles used in the decoder workflow:

nvjpeg2kStream_t - is used to parse the bitstream and store the bitstream metadata
nvjpeg2kHandle_t - is the nvjpeg2k library handle
nvjpeg2kDecodeState_t - is used to store the work buffers required for decode

nvjpeg2kHandle_t nvjpeg2k_handle;
nvjpeg2kStream_t nvjpeg2k_stream;
nvjpeg2kDecodeState_t decode_state;

nvjpeg2kCreateSimple(&nvjpeg2k_handle);
nvjpeg2kDecodeStateCreate(&decode_state);
nvjpeg2kStreamCreate(&nvjpeg2k_stream);

Read the jpeg2k bitstream file and store it on the host buffer. The library supports the .jp2 files and jpeg2000 codestreams.
Use the nvjpeg2kStreamParse API to parse the bitstream.

size_t length;
unsigned char *bitstream_buffer;  // host or pinned memory
// read the bitstream and store it in bitstream_buffer;

// content of bitstream buffer should not be overwritten until the decoding is complete
nvjpeg2kStreamParse(nvjpeg2k_handle, bitstream_buffer, length, 0, 0, nvjpeg2k_stream);

4. Extract the image dimensions for each component and allocate output memory on the device as shown in the below snippet. The decoded output is stored in nvjpeg2kImage_t. This data structure has the ability to handle data 8 and 16 bit precision outputs. The below snippet demonstrates initialization of nvjpeg2kImage_t for an 8 bit 3 channel image.

#define NUM_COMPONENTS 3
// extract image info
nvjpeg2kImageInfo_t image_info;
nvjpeg2kStreamGetImageInfo(nvjpeg2k_stream, &image_info);

// assuming the decoding of images with 8 bit precision, and 3 components

nvjpeg2kImageComponentInfo_t image_comp_info[NUM_COMPONENTS];

for (int c = 0; c < image_info.num_components; c++)
{
    nvjpeg2kStreamGetImageComponentInfo(nvjpeg2k_stream, &image_comp_info[c], c);
}

unsigned char *decode_output[NUM_COMPONENTS];
size_t*pitch_in_bytes[NUM_COMPONENTS];

nvjpeg2kImage_t output_image;
for (int c = 0; c < NUM_COMPONENTS; c++)
{
    cudaMallocPitch(&decode_output[c], &pitch_in_bytes[c], image_comp_info[c].comp_width, image_comp_info[c].comp_height );
}

output_image.pixel_data = decode_output;
output_image.pixel_type = NVJPEG2K_UINT8;
output_image.pitch_in_bytes = pitch_in_bytes;

Call nvjpeg2kDecode(). One of the parameters of this function is cudaStream_t. If a stream identifier is passed then the API will use it to issue the asychronous cuda calls. cudaDeviceSynchronize() is required to complete the decoding process since nvjpeg2kDecode is asychronous with respect to the host.

nvjpeg2kDecode(nvjpeg2k_handle, decode_state, nvjpeg2k_stream, &output_image, 0); // 0 corresponds to cudaStream_t
cudaDeviceSynchronize()

Go to step 2 to decode another image. Once the decoding is completed release the nvjpeg2k library resources by calling the corresponding destroy APIs.