Advanced Multi-image: SubIFDs and Pagination

This notebook demonstrates advanced TIFF features for working with complex multi-image files:

  • SubIFDs (Sub Image File Directories): Access embedded images like thumbnails or reduced-resolution pyramid levels

  • Pagination: Process large multi-image files efficiently by parsing images in batches

These features are particularly useful for:

  • Pyramidal TIFFs (whole-slide imaging, geospatial)

  • OME-TIFF (microscopy with multiple resolution levels)

  • Large multi-page TIFFs where you want to process images incrementally

Setup

[1]:
import os
import numpy as np
from matplotlib import pyplot as plt
from nvidia import nvimgcodec
[2]:
resources_dir = os.getenv("PYNVIMGCODEC_EXAMPLES_RESOURCES_DIR", "../assets/images/")

# Use GPU_ONLY backend (some OME-TIFF image features are not supported with libtiff yet)
decoder = nvimgcodec.Decoder(backends=[nvimgcodec.BackendKind.GPU_ONLY])

Part 1: SubIFD Access (Thumbnails)

SubIFDs are additional Image File Directories embedded within a TIFF page. Common uses include:

  • Thumbnails for quick preview

  • Reduced resolution levels for pyramidal images

  • Mask or alpha channels

Every CodeStream exposes the offsets of its SubIFDs through the subifd_offsets property. Pass one of those offsets as the bitstream_offset parameter to open the corresponding SubIFD image.
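To make the idea of a SubIFD offset concrete, here is a minimal sketch in plain Python that builds a tiny in-memory TIFF header with a single IFD and reads back the values of its SubIFDs tag (tag 330). It is not how nvimgcodec parses files. The helpers build_tiny_tiff and read_subifd_offsets are hypothetical names introduced only for this illustration, and the sketch assumes a little-endian classic TIFF (no BigTIFF). The offsets stored here point at nothing, since only the tag lookup is being shown.

```python
import struct

SUBIFDS_TAG = 330            # TIFF "SubIFDs" tag
TYPE_LONG, TYPE_IFD = 4, 13  # field types a SubIFDs entry may use


def build_tiny_tiff(subifd_offsets):
    """Hypothetical helper: header + one IFD holding only a SubIFDs entry."""
    header = struct.pack("<2sHI", b"II", 42, 8)  # byte order, magic, first IFD offset
    count = len(subifd_offsets)
    ifd_size = 2 + 12 + 4  # entry count + one 12-byte entry + next-IFD offset
    if count == 1:
        # A single 4-byte value is stored inline in the entry.
        value, extra = subifd_offsets[0], b""
    else:
        # Several values live in an array placed right after the IFD.
        value = len(header) + ifd_size
        extra = struct.pack(f"<{count}I", *subifd_offsets)
    entry = struct.pack("<HHII", SUBIFDS_TAG, TYPE_LONG, count, value)
    ifd = struct.pack("<H", 1) + entry + struct.pack("<I", 0)
    return header + ifd + extra


def read_subifd_offsets(data):
    """Hypothetical helper: return the SubIFD offsets listed in the first IFD."""
    byte_order, magic, ifd_pos = struct.unpack_from("<2sHI", data, 0)
    if byte_order != b"II" or magic != 42:
        raise ValueError("this sketch only handles little-endian classic TIFF")
    (num_entries,) = struct.unpack_from("<H", data, ifd_pos)
    for k in range(num_entries):
        tag, ftype, count, value = struct.unpack_from(
            "<HHII", data, ifd_pos + 2 + 12 * k
        )
        if tag == SUBIFDS_TAG and ftype in (TYPE_LONG, TYPE_IFD):
            if count == 1:
                return [value]
            return list(struct.unpack_from(f"<{count}I", data, value))
    return []  # no SubIFDs tag in this IFD


print(read_subifd_offsets(build_tiny_tiff([1555488])))
print(read_subifd_offsets(build_tiny_tiff([71428925, 89622101])))
```

Each returned value is a byte position in the file where another IFD starts, which is why an offset can be handed to a parser as the place to begin reading.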

[3]:
# Load a TIFF with a thumbnail SubIFD
cat_path = os.path.join(resources_dir, "cat_with_thumbnail.tiff")
cs = nvimgcodec.CodeStream(cat_path)

print(f"Main image: {cs.width}x{cs.height}, {cs.num_channels} channels")
print(f"Number of images (main IFDs): {cs.num_images}")
Main image: 720x720, 3 channels
Number of images (main IFDs): 1
[4]:
# Decode the main image
main_img = decoder.decode(cs)

plt.figure(figsize=(6, 6))
plt.title(f"Main Image: {main_img.shape}")
plt.imshow(main_img.cpu())
../_images/samples_advanced_multi_image_6_1.png
[5]:
# Get SubIFD offsets directly from the CodeStream
subifd_offsets = cs.subifd_offsets
print(f"SubIFD offsets: {subifd_offsets}")

# Access the thumbnail using its offset
thumb_cs = cs.get_sub_code_stream(bitstream_offset=subifd_offsets[0])
print(f"Thumbnail: {thumb_cs.width}x{thumb_cs.height}, {thumb_cs.num_channels} channels")
SubIFD offsets: [1555488]
Thumbnail: 180x180, 3 channels
[6]:
# Decode the thumbnail
thumb_img = decoder.decode(thumb_cs)

# Compare main image and thumbnail side by side
fig, axes = plt.subplots(1, 2, figsize=(12, 6))

axes[0].imshow(main_img.cpu())
axes[0].set_title(f"Main Image: {main_img.shape}")

axes[1].imshow(thumb_img.cpu())
axes[1].set_title(f"Thumbnail (SubIFD): {thumb_img.shape}")

plt.tight_layout()
../_images/samples_advanced_multi_image_8_0.png

Part 2: Pagination

For large multi-image TIFFs, you may want to process images in batches rather than parsing all IFDs upfront. The pagination feature allows you to:

  1. Limit parsing to N images at a time using limit_images

  2. Get the offset to continue parsing via next_bitstream_offset

  3. Resume parsing from where you left off using bitstream_offset

You can set these parameters either:

  • At CodeStream creation: nvimgcodec.CodeStream(path, limit_images=10)

  • Via get_sub_code_stream: cs.get_sub_code_stream(limit_images=10)

Because only the requested IFDs are parsed at each step, this approach scales to files with hundreds or thousands of images.
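The three pagination steps can be wrapped in a generator. The sketch below is library-independent so that it runs anywhere: iter_batches, FakeStream, and make_fake_opener are hypothetical names, and the fake opener only mimics the two attributes the loop relies on (num_images and next_bitstream_offset, with None meaning that nothing is left to parse).

```python
from dataclasses import dataclass
from typing import Callable, Iterator, Optional


@dataclass
class FakeStream:
    """Hypothetical stand-in for a paginated code stream."""
    num_images: int
    next_bitstream_offset: Optional[int]


def make_fake_opener(total_images: int, bytes_per_image: int = 1000):
    """Return an opener that pretends every image occupies bytes_per_image bytes."""
    def open_stream(offset: Optional[int], limit: int) -> FakeStream:
        start = 0 if offset is None else offset // bytes_per_image
        count = min(limit, total_images - start)
        end = start + count
        next_offset = end * bytes_per_image if end < total_images else None
        return FakeStream(count, next_offset)
    return open_stream


def iter_batches(
    open_stream: Callable[[Optional[int], int], FakeStream], batch_size: int
) -> Iterator[FakeStream]:
    """Yield one stream per batch until next_bitstream_offset is None."""
    offset = None  # None means "start at the beginning of the file"
    while True:
        stream = open_stream(offset, batch_size)
        yield stream
        offset = stream.next_bitstream_offset
        if offset is None:
            return


sizes = [s.num_images for s in iter_batches(make_fake_opener(25), 10)]
print(sizes)  # [10, 10, 5]
```

With the real library, the opener would be a small function that calls nvimgcodec.CodeStream with limit_images for the first batch and with both bitstream_offset and limit_images for later batches, as the cells in this part do.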

[7]:
# Load a large OME-TIFF with many images (non-paginated)
ome_path = os.path.join(resources_dir, "retina_large.ome.tiff")
ome_cs = nvimgcodec.CodeStream(ome_path)

print(f"Total images in file: {ome_cs.num_images}")
print(f"Image dimensions: {ome_cs.width}x{ome_cs.height}")
Total images in file: 192
Image dimensions: 2048x1567
[8]:
# Pagination pattern: parse the file in batches of BATCH_SIZE images
# instead of parsing every IFD upfront. Start with a limited CodeStream.
BATCH_SIZE = 10
batch_cs = nvimgcodec.CodeStream(ome_path, limit_images=BATCH_SIZE)

batch_num = 0
while True:
    next_offset = batch_cs.next_bitstream_offset
    print(f"Batch {batch_num}: {batch_cs.num_images} images")

    if next_offset is None:
        print("  -> No more images")
        break

    print(f"  -> Next offset: {next_offset}")
    batch_num += 1

    # Continue from the next offset (either the CodeStream constructor or get_sub_code_stream works)
    batch_cs = nvimgcodec.CodeStream(ome_path, bitstream_offset=next_offset, limit_images=BATCH_SIZE)

    # Stop after a few batches for demo purposes
    if batch_num >= 5:
        print("  (stopping early for demo)")
        break
Batch 0: 10 images
  -> Next offset: 756653
Batch 1: 10 images
  -> Next offset: 8030165
Batch 2: 10 images
  -> Next offset: 16898256
Batch 3: 10 images
  -> Next offset: 22805785
Batch 4: 10 images
  -> Next offset: 29928084
  (stopping early for demo)

Note: image_idx is relative to the bitstream offset the CodeStream was opened at, not to the start of the file. In a paginated CodeStream, image_idx=0 is the first image of the current batch.
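When a file-wide image number is needed, the batch-relative index has to be converted. The sketch below shows the arithmetic, assuming every batch before the current one holds exactly batch_size images. The function names are hypothetical and not part of nvimgcodec.

```python
def to_batch_index(global_idx: int, batch_size: int) -> tuple:
    """Split a file-wide image index into (batch number, index within the batch)."""
    return divmod(global_idx, batch_size)


def to_global_index(batch_num: int, batch_size: int, local_idx: int) -> int:
    """Recover the file-wide index from a batch number and a batch-relative index."""
    return batch_num * batch_size + local_idx


batch_num, local_idx = to_batch_index(31, 10)
print(batch_num, local_idx)                        # 3 1
print(to_global_index(batch_num, 10, local_idx))   # 31
```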

[9]:
# Decode and display a few images from the point where the loop stopped
NUM_IMAGES = 3
fig, axes = plt.subplots(1, NUM_IMAGES, figsize=(3 * NUM_IMAGES, 3))

# batch_cs is the last batch opened by the loop. If another batch follows it,
# decode from that next offset; otherwise fall back to batch_cs itself.
current_offset = batch_cs.next_bitstream_offset
if current_offset is not None:
    batch_to_decode = nvimgcodec.CodeStream(ome_path, bitstream_offset=current_offset, limit_images=NUM_IMAGES)
else:
    batch_to_decode = batch_cs

for i in range(NUM_IMAGES):
    img_cs = batch_to_decode.get_sub_code_stream(image_idx=i)
    img = decoder.decode(img_cs, params=nvimgcodec.DecodeParams(color_spec=nvimgcodec.ColorSpec.UNCHANGED))

    axes[i].imshow(img.cpu())
    axes[i].set_title(f"Image {i}")

plt.tight_layout()
../_images/samples_advanced_multi_image_13_0.png

Part 3: SubIFDs in OME-TIFF (Pyramid Levels)

OME-TIFF files often contain multiple resolution levels stored as SubIFDs. Each main image may have SubIFDs containing downsampled versions (e.g., 1/2, 1/4 resolution).

This is useful for:

  • Quick thumbnail generation

  • Progressive loading in viewers

  • Memory-efficient processing at lower resolutions

We’ll examine image 31 from the Z-stack, which has clearly visible retinal structures. The subifd_offsets property makes it easy to discover and access pyramid levels.

[10]:
# Get SubIFD offsets for a representative main IFD
# We use image_idx=31 which has more visible content than the first slice
IMAGE_IDX = 31

selected_ifd = ome_cs.get_sub_code_stream(image_idx=IMAGE_IDX)

# Use subifd_offsets property to get pyramid level offsets
subifd_offsets = selected_ifd.subifd_offsets

print(f"Image {IMAGE_IDX} has {len(subifd_offsets)} SubIFDs (pyramid levels)")
print(f"SubIFD offsets: {subifd_offsets}")

# Build pyramid level info
print(f"\nPyramid levels for Z-slice {IMAGE_IDX}:")
print(f"  Level 0 (full): {selected_ifd.width}x{selected_ifd.height}")

for i, offset in enumerate(subifd_offsets):
    level_cs = nvimgcodec.CodeStream(ome_path, bitstream_offset=offset)
    print(f"  Level {i+1} (1/{2**(i+1)}): {level_cs.width}x{level_cs.height}")
Image 31 has 2 SubIFDs (pyramid levels)
SubIFD offsets: [71428925, 89622101]

Pyramid levels for Z-slice 31:
  Level 0 (full): 2048x1567
  Level 1 (1/2): 1024x783
  Level 2 (1/4): 512x391
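The sizes printed above are consistent with halving each dimension per level and rounding down. The sketch below reproduces them under that assumption. pyramid_dims is a hypothetical helper, and other pyramid writers may round up or use different scale factors, so treat the result as an expectation to check rather than a guarantee.

```python
def pyramid_dims(width: int, height: int, num_levels: int) -> list:
    """Return (width, height) per level, halving with floor division at each step."""
    dims = []
    for _ in range(num_levels):
        dims.append((width, height))
        # Never let a dimension drop below one pixel.
        width, height = max(1, width // 2), max(1, height // 2)
    return dims


for level, (w, h) in enumerate(pyramid_dims(2048, 1567, 3)):
    print(f"Level {level}: {w}x{h}")
```

Comparing these expected sizes with the width and height reported by each SubIFD CodeStream is a quick sanity check that the offsets really point at pyramid levels.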
[11]:
# Decode and display all pyramid levels
num_levels = 1 + len(subifd_offsets)  # Main IFD + SubIFDs
fig, axes = plt.subplots(1, num_levels, figsize=(3 * num_levels, 3))

# Decode main IFD (Level 0)
img = decoder.decode(selected_ifd, params=nvimgcodec.DecodeParams(color_spec=nvimgcodec.ColorSpec.UNCHANGED))
axes[0].imshow(img.cpu(), cmap='gray')
axes[0].set_title(f"Level 0 (full)\n{img.shape[1]}x{img.shape[0]}")

# Decode SubIFDs (Level 1, 2, ...) using bitstream_offset
for i, offset in enumerate(subifd_offsets):
    level_cs = nvimgcodec.CodeStream(ome_path, bitstream_offset=offset)
    img = decoder.decode(level_cs, params=nvimgcodec.DecodeParams(color_spec=nvimgcodec.ColorSpec.UNCHANGED))

    axes[i + 1].imshow(img.cpu(), cmap='gray')
    axes[i + 1].set_title(f"Level {i + 1} (1/{2**(i+1)})\n{img.shape[1]}x{img.shape[0]}")

plt.tight_layout()
../_images/samples_advanced_multi_image_16_0.png