NVSHMEM Release 1.0.1

These are the release notes for NVIDIA® NVSHMEM™ 1.0.1, the first official release of NVSHMEM.

Key Features And Enhancements

This NVSHMEM release includes the following key features and enhancements.
  • Combines the memory of multiple GPUs into a partitioned global address space that’s accessed through NVSHMEM APIs.

  • Includes a low-overhead, in-kernel communication API for use by GPU threads.

  • Includes stream-based and CPU-initiated communication APIs.

  • Supports peer-to-peer communication over NVIDIA® NVLink® and PCI Express, and inter-node communication over NVIDIA Mellanox® InfiniBand for GPU clusters.

  • Supports x86 and POWER9 processors.

  • Is interoperable with MPI and other OpenSHMEM implementations.
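
The feature list above can be illustrated with a minimal sketch of the programming model: symmetric memory allocated with nvshmem_malloc and a GPU thread issuing a one-sided put to a peer PE with the in-kernel API. This is a hedged sketch, not official sample code; the grid size, the ring-neighbor pattern, and the use of the default stream are illustrative choices.

```cuda
// Minimal NVSHMEM sketch: each PE writes its rank into the next PE's
// symmetric buffer using a device-side put. Assumes an NVSHMEM-enabled
// build and launcher (e.g. nvshmrun/mpirun); error handling omitted.
#include <nvshmem.h>
#include <nvshmemx.h>
#include <cstdio>

__global__ void put_kernel(int *dst, int mype, int npes) {
    int peer = (mype + 1) % npes;
    // In-kernel, one-sided put: a GPU thread writes directly into the
    // peer PE's symmetric memory.
    nvshmem_int_p(dst, mype, peer);
}

int main() {
    nvshmem_init();
    int mype = nvshmem_my_pe();
    int npes = nvshmem_n_pes();

    // Symmetric allocation: every PE gets a buffer at the same
    // symmetric-heap offset, addressable by remote PEs.
    int *dst = (int *) nvshmem_malloc(sizeof(int));

    put_kernel<<<1, 1>>>(dst, mype, npes);

    // Stream-based API: complete outstanding puts and synchronize all
    // PEs on the default stream.
    nvshmemx_barrier_all_on_stream(0);
    cudaDeviceSynchronize();

    int val;
    cudaMemcpy(&val, dst, sizeof(int), cudaMemcpyDeviceToHost);
    printf("PE %d received %d\n", mype, val);

    nvshmem_free(dst);
    nvshmem_finalize();
    return 0;
}
```

Note that the buffer returned by nvshmem_malloc is symmetric: remote PEs address it by the same pointer value passed to the put, which is what makes the one-sided, in-kernel communication possible.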


NVSHMEM 1.0.1 has been tested with the following:

Known Issues

  • NVSHMEM, and libraries that use NVSHMEM, can only be built as static libraries, not as shared libraries, because CUDA device symbols cannot be linked across shared-library boundaries.

  • NVSHMEM collective operations with overlapping active sets are known to fail in some scenarios.

  • nvshmem_quiet ensures visibility of prior operations only at their target PEs (PE-to-PE visibility), not global visibility of data across all PEs.
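
The nvshmem_quiet limitation above has a practical consequence for the common put-then-flag pattern. The sketch below is a hedged illustration, assuming a producer/consumer pair of PEs; the symbols data, flag, and peer are hypothetical names, not part of the API.

```cuda
// Hedged sketch: using nvshmem_quiet to order a payload put before a
// flag put to the SAME target PE. quiet completes prior puts so that
// 'data' is visible at 'peer' before 'flag' is set there.
__global__ void ordered_put(int *data, int *flag, int peer) {
    nvshmem_int_p(data, 42, peer);  // payload put to peer
    nvshmem_quiet();                // complete it: payload visible at peer
    nvshmem_int_p(flag, 1, peer);   // peer may now poll flag, then read data
}
```

Because quiet guarantees only PE-to-PE visibility, a third PE that observes the flag (for example, after the target forwards it) cannot assume the payload is globally visible; any such cross-PE handoff needs its own synchronization.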