NVIDIA Optimized Frameworks

DGL Release 25.08

This DGL container release is intended for use on NVIDIA® Hopper™ architecture GPUs (such as the NVIDIA H100) and NVIDIA® Ampere architecture GPUs (such as the NVIDIA A100), together with the associated NVIDIA CUDA® 12 and NVIDIA cuDNN 9 libraries.

Note:

Deprecation notice: DGL containers will no longer be supported after the 25.08 release. For continued accelerated and optimized Graph Neural Network (GNN) workloads, NVIDIA recommends migrating to the NVIDIA PyG container (https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pyg).


Contents of the DGL container

This container image contains the complete source of this version of DGL in /opt/dgl/dgl-source. DGL is pre-built and installed as a system Python module.

The container includes the following:

  • DGL 2.4
  • RAPIDS 25.06
  • WholeGraph 24.08 with NVSHMEM support. WholeGraph is part of the NVIDIA RAPIDS library and provides an underlying graph storage structure that enhances GNN training, optimized especially for NVIDIA hardware.

GPU Requirements

Release 25.08 supports CUDA compute capability 6.0 and later. This corresponds to GPUs in the NVIDIA Pascal, NVIDIA Volta™, NVIDIA Turing™, NVIDIA Ampere architecture, and NVIDIA Hopper™ architecture families. For a list of GPUs to which this compute capability corresponds, see CUDA GPUs. For additional support details, see Deep Learning Frameworks Support Matrix.
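The compute-capability floor above can be stated programmatically. A minimal sketch (the names `MIN_SUPPORTED_CC`, `ARCH_COMPUTE_CAPABILITY`, and `is_supported` are illustrative, not part of the container; the architecture-to-capability mapping is standard CUDA documentation data):

```python
# Minimum compute capability supported by release 25.08, per the note above.
MIN_SUPPORTED_CC = (6, 0)

# Baseline (major, minor) compute capability introduced by each architecture
# family named in the support statement. Standard CUDA data, shown here only
# to illustrate the check; not defined by the container.
ARCH_COMPUTE_CAPABILITY = {
    "Pascal": (6, 0),
    "Volta": (7, 0),
    "Turing": (7, 5),
    "Ampere": (8, 0),
    "Hopper": (9, 0),
}

def is_supported(arch: str) -> bool:
    """Return True if the architecture meets the release's CC floor."""
    return ARCH_COMPUTE_CAPABILITY[arch] >= MIN_SUPPORTED_CC

# All five listed families meet the 6.0 floor.
print([a for a in ARCH_COMPUTE_CAPABILITY if is_supported(a)])
```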

Key Features and Enhancements

This DGL release includes the following key features and enhancements.

  • This release of the NVIDIA DGL container supports NVSHMEM for distributed feature scatter. See the examples in /workspace/examples/wholegraph-examples.
  • Added the NVIDIA Synthetic Graph Generation tool for generating graphs of arbitrary size, including node and edge tabular features.

Announcements

NVIDIA DGL Container Versions

The following table shows the versions of Ubuntu, the CUDA Toolkit, DGL, and PyTorch supported in each NVIDIA container for DGL. For older container versions, refer to the Frameworks Support Matrix.

| Container Version | Ubuntu | CUDA Toolkit           | DGL          | PyTorch |
|-------------------|--------|------------------------|--------------|---------|
| 25.08             | 24.04  | NVIDIA CUDA 13.0.0.044 | 2.5          | 25.08   |
| 25.05             | 24.04  | NVIDIA CUDA 12.9.0     | 2.5          | 25.05   |
| 25.03             | 24.04  | NVIDIA CUDA 12.8.1     | 2.5          | 25.03   |
| 25.01             | 24.04  | NVIDIA CUDA 12.8.0     | 2.5          | 25.01   |
| 24.11             | 24.04  | NVIDIA CUDA 12.6.2     | 2.5          | 24.11   |
| 24.09             | 22.04  | NVIDIA CUDA 12.6.1     | 2.4          | 24.09   |
| 24.07             | 22.04  | NVIDIA CUDA 12.5.1     | 2.3          | 24.07   |
| 24.05             | 22.04  | NVIDIA CUDA 12.4.1     | 2.2          | 24.05   |
| 24.04             | 22.04  | NVIDIA CUDA 12.4.1     | 2.1+e1f7738  | 24.04   |
| 24.03             | 22.04  | NVIDIA CUDA 12.4.0.41  | 2.1+7c51cd16 | 24.03   |
| 24.01             | 22.04  | NVIDIA CUDA 12.3.2     | 1.2+c660f5c  | 24.01   |
| 23.11             | 22.04  | NVIDIA CUDA 12.3.0     | 1.1.2        | 23.11   |
| 23.09             | 22.04  | NVIDIA CUDA 12.2.1     | 1.1.2        | 23.09   |
| 23.07             | 22.04  | NVIDIA CUDA 12.1.1     | 1.1.1        | 23.07   |

Known Issues

  • Refer to the CUDA DL Release Notes for additional details.
  • When CPU sampling is enabled (use_uva=False and num_workers > 0), the DGL sampling process initializes a CUDA context (issue-6561), which can result in a segmentation fault with the CUDA driver in the container.
  • Tensors used as node features must be contiguous and cannot be views of other tensors when the use_uva flag is set to True in the dgl.dataloading.DataLoader class.

    When you attempt to use a graph whose edata or ndata contains non-contiguous or view tensors, a DGLError is raised.
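The contiguity condition above can be checked and fixed before features are assigned. A minimal sketch using NumPy as a stand-in for the feature tensors (PyTorch tensors expose the analogous `.is_contiguous()` / `.contiguous()` API; the variable names are illustrative):

```python
import numpy as np

# A strided slice is a view and is not C-contiguous -- the condition that
# triggers a DGLError when use_uva=True.
features = np.arange(20.0).reshape(10, 2)
view = features[::2]
assert not view.flags["C_CONTIGUOUS"]

# ascontiguousarray() materializes a dense copy that satisfies the requirement.
safe = np.ascontiguousarray(view)
assert safe.flags["C_CONTIGUOUS"]

# Inside the container, the same fix with PyTorch tensors would look like:
#   g.ndata["feat"] = feat_tensor.contiguous()
```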

© Copyright 2025, NVIDIA. Last updated on Aug 21, 2025.