DGL Release 24.03

The NVIDIA container image for DGL, release 24.03, is available on NGC.

Contents of the DGL container

This container image contains the complete source of the included version of DGL in /opt/dgl/dgl-source. DGL is pre-built and installed as a system Python module.
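A quick way to confirm that the pre-built module is visible to the container's Python interpreter can be sketched as follows. The helper name `module_available` is hypothetical, and checking `torch` alongside `dgl` is an assumption based on the container being built on PyTorch; the sketch only locates the modules and does not import them.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # Module names checked here are assumptions about what the image provides.
    for name in ("dgl", "torch"):
        status = "found" if module_available(name) else "missing"
        print(f"{name}: {status}")
```

Locating the module instead of importing it keeps the check free of side effects such as CUDA initialization.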

The container includes the following:

DGL 2.1+7c51cd16 (including DGL-Graphbolt, a recently released GNN dataloader library which has achieved state-of-the-art performance on NVIDIA GPUs)
RAPIDS 24.02
WholeGraph 24.02 with NVSHMEM support. WholeGraph, a part of NVIDIA RAPIDS, provides an underlying graph storage structure that enhances GNN training and is optimized for NVIDIA hardware.
Built on PyTorch 24.03 (see contents of PyTorch container).

Key Features and Enhancements

This DGL release includes the following key features and enhancements.

DGL container image version 24.03 is based on DGL 2.1+7c51cd16.
This release of the NVIDIA DGL container adds NVSHMEM support for distributed feature scatter. See the examples located at /workspace/examples/wholegraph-examples.
Added the NVIDIA Synthetic Graph Generation tool for generating graphs of arbitrary size, including node and edge tabular features.
The major features of the release can be found in the DGL release notes.

Announcements

None.

NVIDIA DGL Container Versions

The following table shows which versions of Ubuntu, CUDA, DGL, and PyTorch are supported in each NVIDIA container for DGL. For older container versions, refer to the Frameworks Support Matrix.

Container Version	Ubuntu	CUDA Toolkit	DGL	PyTorch
24.03	22.04	NVIDIA CUDA 12.4.0.41	2.1+7c51cd16	24.03
24.01		NVIDIA CUDA 12.3.2	1.2+c660f5c	24.01
23.11		NVIDIA CUDA 12.3.0	1.1.2	24.01
23.09		NVIDIA CUDA 12.2.1	1.1.2	23.09
23.07		NVIDIA CUDA 12.1.1	1.1.1	23.07

Known Issues

When CPU sampling is enabled (use_uva=False and num_workers>0), the DGL sampling process initializes a CUDA instance (issue-6561), which can result in a segmentation fault with the current CUDA driver in the container.
When the use_uva flag is set to True in the dgl.dataloading.DataLoader class, the tensors that are used as node features must be contiguous and cannot be views of other tensors.

When you attempt to use a graph with non-contiguous or view tensors for edata or ndata, a DGLError will occur.
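One way to satisfy the contiguity requirement is to normalize each feature tensor before attaching it to the graph. The sketch below is a minimal illustration, not DGL's own check: the helper name `as_owned_contiguous` is hypothetical, and it relies on the PyTorch tensor methods `is_contiguous()`, `contiguous()`, and `clone()` plus the `_base` attribute, which PyTorch sets on view tensors. That `_base` matches the condition DGL tests internally is an assumption.

```python
def as_owned_contiguous(t):
    """Return a tensor that is contiguous and owns its storage.

    `t` is expected to behave like a torch.Tensor. The function is
    duck-typed, so it does not import torch itself.
    """
    if not t.is_contiguous():
        # contiguous() copies the data into fresh, densely packed storage,
        # so the result is no longer a view of the original tensor.
        return t.contiguous()
    if getattr(t, "_base", None) is not None:
        # Contiguous but still a view (for example, a row slice): copy it.
        return t.clone()
    # Already contiguous and not a view: return unchanged, without copying.
    return t
```

With PyTorch and DGL available, a sliced or transposed feature tensor could then be assigned as `g.ndata["feat"] = as_owned_contiguous(feat)` before the graph is passed to a data loader with use_uva=True. The helper copies only when needed, which avoids duplicating large feature tensors that already meet the requirement.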