NVIDIA DALI
1.24.0 -163462f

PaddlePaddle

  • PaddlePaddle Plugin API reference
  • PaddlePaddle Framework
    • Using DALI in PaddlePaddle
    • ExternalSource operator
    • Using Paddle DALI plugin: using various readers

© Copyright 2018-2023, NVIDIA Corporation.
