Introduction

nvTIFF is a GPU-accelerated TIFF (Tagged Image File Format) encode/decode library built on the CUDA platform. The library is supported on Volta and newer GPU architectures. It supports the following TIFF feature set:

Note

Throughout this document, the terms “CPU” and “Host” are used synonymously. Similarly, the terms “GPU” and “Device” are synonymous.

Decoder

  • Planar Separate and Contiguous modes

  • Samples per pixel

    • Up to 8 samples per pixel in Planar Contiguous mode

    • Up to 4 samples per pixel in Planar Separate mode

  • Compression

    • JPEG (via nvJPEG)

    • Deflate (via nvCOMP)

    • LZW

    • None (uncompressed)

  • Color spaces: Grayscale, RGB, YCbCr, and RGB Palette. When the compressed data is in YCbCr or Palette mode, the library converts the decoded output to the RGB color space.

  • TIFF files can use either tiles or strips.

  • Up to 32 bits per sample when the compression type is None, Deflate, or LZW; up to 8 bits per sample when the compression type is JPEG.

  • TIFFs with multiple images having different properties.

  • APIs to retrieve GeoTIFF Metadata
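The decoder limits listed above can be written down as a small pre-flight check. The following is a minimal sketch in Python, not part of the nvTIFF API: the `ImageProps` class, the `decoder_issues` function, and the string labels for planar mode, compression, and color space are all hypothetical names chosen for illustration. Only the numeric limits come from the list.

```python
from dataclasses import dataclass

# Limits transcribed from the decoder feature list; all names are hypothetical.
MAX_SAMPLES_PER_PIXEL = {"contiguous": 8, "separate": 4}
MAX_BITS_PER_SAMPLE = {"jpeg": 8, "deflate": 32, "lzw": 32, "none": 32}
COLOR_SPACES = {"grayscale", "rgb", "ycbcr", "palette"}


@dataclass(frozen=True)
class ImageProps:
    planar: str             # "contiguous" or "separate"
    samples_per_pixel: int
    compression: str        # "jpeg", "deflate", "lzw" or "none"
    bits_per_sample: int
    color_space: str        # one of COLOR_SPACES


def decoder_issues(p):
    """List the ways an image falls outside the documented decoder limits."""
    issues = []
    max_samples = MAX_SAMPLES_PER_PIXEL.get(p.planar)
    if max_samples is None:
        issues.append("unknown planar mode: " + p.planar)
    elif not 1 <= p.samples_per_pixel <= max_samples:
        issues.append("samples per pixel out of range for " + p.planar + " mode")
    max_bits = MAX_BITS_PER_SAMPLE.get(p.compression)
    if max_bits is None:
        issues.append("unsupported compression: " + p.compression)
    elif not 1 <= p.bits_per_sample <= max_bits:
        issues.append("bits per sample out of range for " + p.compression)
    if p.color_space not in COLOR_SPACES:
        issues.append("unsupported color space: " + p.color_space)
    return issues
```

An empty list means the image is within the documented limits. For example, a 16-bit JPEG image with 5 samples per pixel in Planar Separate mode would be reported with two issues.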

The diagram below shows how the nvTIFF decoder interacts with other CUDA libraries such as nvJPEG and nvCOMP (for Deflate decompression). The user application calls CUDA APIs to create the decode output buffers before calling the nvTIFF decoder.

Figure: nvTIFF decoder overview

Encoder

  • Planar Contiguous mode only.

  • Up to 4 samples per pixel.

  • LZW compression.

  • Compressed data is organized in strips.

  • Up to 32 bits per sample.

  • Multiple images in a TIFF file. All images to be compressed must have identical properties.
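The encoder constraints above can be checked before a batch of images is submitted. This is a minimal sketch in Python, not nvTIFF code: `EncodeProps` and `can_encode_together` are hypothetical names, and treating image width and height as part of the "identical properties" is an assumption made here for illustration.

```python
from typing import NamedTuple


class EncodeProps(NamedTuple):
    # Hypothetical description of one input image.
    width: int
    height: int
    samples_per_pixel: int
    bits_per_sample: int
    planar: str = "contiguous"


def can_encode_together(images):
    """True if the images satisfy the documented encoder constraints."""
    if not images:
        return False
    first = images[0]
    # All images in one file must share identical properties
    # (assumed here to include the dimensions).
    if any(img != first for img in images):
        return False
    return (
        first.planar == "contiguous"                # Planar Contiguous only
        and 1 <= first.samples_per_pixel <= 4       # up to 4 samples per pixel
        and 1 <= first.bits_per_sample <= 32        # up to 32 bits per sample
    )
```

A batch that mixes, say, 8-bit and 16-bit images returns `False`; under these constraints such images would have to go into separate files.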

Figure: nvTIFF encoder overview

Applying GPU Acceleration to TIFF files

A TIFF file may contain one or more images. Each image is subdivided into strips or tiles, which can be encoded or decoded in parallel, providing a speedup over CPU implementations.
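To get a feel for how much parallel work an image offers, the number of strips or tiles can be computed from its dimensions. The sketch below follows the counting rules of the TIFF 6.0 specification (round up partial strips and tiles; in Planar Separate mode each sample plane has its own set). The function names are hypothetical and are not part of nvTIFF.

```python
def ceil_div(a, b):
    """Integer division rounded up."""
    return (a + b - 1) // b


def strips_per_image(height, rows_per_strip, samples_per_pixel=1,
                     planar_separate=False):
    """Number of independently compressed strips in one image."""
    strips = ceil_div(height, rows_per_strip)
    # In Planar Separate mode every sample plane is stored in its own strips.
    return strips * samples_per_pixel if planar_separate else strips


def tiles_per_image(width, height, tile_width, tile_height,
                    samples_per_pixel=1, planar_separate=False):
    """Number of independently compressed tiles in one image."""
    tiles = ceil_div(width, tile_width) * ceil_div(height, tile_height)
    return tiles * samples_per_pixel if planar_separate else tiles
```

For example, a 1024 x 768 image with 256 x 256 tiles has 4 x 3 = 12 tiles, so 12 units of work can be decoded independently.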

When decoding a TIFF file that contains multiple images with identical metadata, the strips/tiles across all images can be decoded in a single batched CUDA kernel. Encoding works the same way in reverse: each strip/tile is compressed in parallel, and the compressed strips/tiles are then stitched together to create a TIFF file.
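The compress-in-parallel-then-stitch idea can be illustrated on the CPU. The sketch below is an analogy only and does not use nvTIFF: CPU threads stand in for GPU kernels, and Deflate from Python's standard `zlib` module stands in for LZW, which the standard library does not provide. The function names are hypothetical. The offsets and byte counts play the role of the per-strip offset and size entries that a TIFF file records.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def split_into_strips(pixels, row_bytes, rows_per_strip):
    """Cut a raw image buffer into strips of rows_per_strip rows each."""
    step = row_bytes * rows_per_strip
    return [pixels[i:i + step] for i in range(0, len(pixels), step)]


def compress_and_stitch(strips):
    """Compress every strip independently, then concatenate the results."""
    with ThreadPoolExecutor() as pool:
        compressed = list(pool.map(zlib.compress, strips))
    offsets, byte_counts, position = [], [], 0
    for blob in compressed:
        offsets.append(position)        # where this strip starts
        byte_counts.append(len(blob))   # how many compressed bytes it has
        position += len(blob)
    return b"".join(compressed), offsets, byte_counts


def decode_strips(data, offsets, byte_counts):
    """Locate each strip by offset and size, then decompress in parallel."""
    blobs = [data[o:o + n] for o, n in zip(offsets, byte_counts)]
    with ThreadPoolExecutor() as pool:
        return b"".join(pool.map(zlib.decompress, blobs))
```

Because no strip depends on any other, the order in which the workers finish does not matter; only the stitching step is sequential.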

Prerequisites

  • CUDA Toolkit version 11.0 or later.

  • CUDA Driver version r450 or later.

  • nvCOMP 2.6 or later (required when the compression type is Deflate).
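The minimum versions above can be compared against an installation with a short script. This is a minimal sketch under stated assumptions: the version strings are supplied by the caller (how they are discovered is out of scope), and `MINIMUMS`, `parse_version`, `at_least`, and `unmet_prerequisites` are hypothetical names, not part of any NVIDIA tool.

```python
# Minimum versions taken from the prerequisites list; key names are hypothetical.
MINIMUMS = {"cuda_toolkit": (11, 0), "cuda_driver": (450,), "nvcomp": (2, 6)}


def parse_version(text):
    """Turn a dotted version string such as '450.80.02' into (450, 80, 2)."""
    return tuple(int(part) for part in text.split("."))


def at_least(found, minimum):
    """Compare two version tuples after padding the shorter one with zeros."""
    width = max(len(found), len(minimum))
    pad = lambda v: v + (0,) * (width - len(v))
    return pad(found) >= pad(minimum)


def unmet_prerequisites(installed, needs_deflate=True):
    """Return the names of prerequisites that are missing or too old."""
    unmet = []
    for name, minimum in MINIMUMS.items():
        if name == "nvcomp" and not needs_deflate:
            continue  # nvCOMP is only required for Deflate compression
        found = installed.get(name)
        if found is None or not at_least(parse_version(found), minimum):
            unmet.append(name)
    return unmet
```

Passing `needs_deflate=False` skips the nvCOMP check, which mirrors the note that nvCOMP is only required for Deflate.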

Platforms Supported

  • Linux versions:

    Architecture  Distribution   Version  Kernel   GCC     GLIBC
    ------------  -------------  -------  -------  ------  -----
    x86_64        RHEL/CentOS    9.1      5.14     11.3.1  2.34
    x86_64        RHEL/CentOS    8.3      4.18     8.5.0   2.28
    x86_64        RHEL/CentOS    7.9      3.10.0   6       2.17
    x86_64        Ubuntu         22.04    5.15.0   11.2.0  2.34
    x86_64        Ubuntu         20.04    5.13.0   9.3.0   2.31
    x86_64        OpenSUSE Leap  15.4     5.14.21  7.5.0   2.31
    x86_64        SUSE SLES      15.4     5.14.21  7.5.0   2.31
    x86_64        Debian         11.6     5.10.0   10.2.1  2.31
    x86_64        Debian         10.13    4.19.0   8.3.0   2.28
    x86_64        Fedora         37       6.07     12.2.1  2.36

  • Windows versions:

    • Windows 10 and Windows Server 2019

    • WSL