NVIDIA® NVSHMEM 3.5.21 Release Notes
NVSHMEM is an implementation of the OpenSHMEM specification for NVIDIA GPUs. The NVSHMEM programming interface implements a Partitioned Global Address Space (PGAS) model across a cluster of NVIDIA GPUs. NVSHMEM provides an easy-to-use interface to allocate memory that is symmetrically distributed across the GPUs. In addition to a CPU-side interface, NVSHMEM provides an NVIDIA® CUDA® kernel-side interface that allows CUDA threads to access any location in the symmetrically distributed memory.
The release notes describe the key features, software enhancements and improvements, and known issues for NVSHMEM 3.5.21 and earlier releases.
Key Features and Enhancements
This NVSHMEM release includes the following key features and enhancements:
Fixed a bug that broke ABI compatibility for the internal team structure.
NVSHMEM4Py release 0.2.2 includes the following:
Removed an incorrect assumption that any NVSHMEM4Py-managed buffer has at most one child buffer (peer or multicast).
Compatibility
NVSHMEM 3.5.21 has been tested with the following:
NCCL:
2.28.3
CUDA Toolkit:
12.4
12.9
13.0
13.1
CPUs:
x86 and NVIDIA Grace™ processors
GPUs:
NVIDIA Ampere A100
NVIDIA Hopper™
NVIDIA Blackwell
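As a rough illustration of how the tested versions above might be checked in a deployment script, the following minimal sketch compares a version string against the list. The helper names (`parse_version`, `is_tested_cuda`, `is_tested_nccl`) are hypothetical and are not part of NVSHMEM or NVSHMEM4Py; only the version numbers come from this page. A version missing from the list is untested, not necessarily unsupported.

```python
# Tested versions copied from the compatibility list in these release notes.
TESTED_CUDA_TOOLKITS = {(12, 4), (12, 9), (13, 0), (13, 1)}
TESTED_NCCL = {(2, 28, 3)}


def parse_version(text):
    """Turn a dotted version string such as '12.9' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def is_tested_cuda(version_text):
    """Return True if the CUDA Toolkit major.minor pair is in the tested list."""
    # The list gives only major.minor, so any patch component is ignored.
    return parse_version(version_text)[:2] in TESTED_CUDA_TOOLKITS


def is_tested_nccl(version_text):
    """Return True if the exact NCCL version is in the tested list."""
    return parse_version(version_text) in TESTED_NCCL
```

For example, `is_tested_cuda("12.9")` is true while `is_tested_cuda("12.6")` is false. How the installed version strings are obtained is left to the caller.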
Limitations
Same as NVSHMEM 3.5.19.
Known Issues
The internal layout of RC-connected QPs changed starting in 3.5.19, which breaks ABI compatibility when IBGDA is enabled.