Python Bindings (NVSHMEM4Py)
NVSHMEM4Py is the official Python language binding for NVSHMEM, providing a Pythonic interface to the NVSHMEM library’s functionality. It enables Python applications to leverage NVSHMEM’s high-performance PGAS (Partitioned Global Address Space) programming model for GPU-accelerated computing.
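To make the PGAS idea concrete, here is a toy, CPU-only sketch of its two core notions: every PE holds an identically shaped (symmetric) buffer, and a put is a one-sided write into another PE's copy. This is not the NVSHMEM4Py API. `ToyPE`, `symmetric_alloc`, and `toy_put` are hypothetical names invented for illustration, and plain Python lists stand in for GPU memory.

```python
# Toy, CPU-only model of PGAS semantics. This is NOT the NVSHMEM4Py API;
# ToyPE, symmetric_alloc, and toy_put are hypothetical names for illustration.

class ToyPE:
    """One processing element (PE) with its own partition of the address space."""

    def __init__(self, pe_id):
        self.pe_id = pe_id
        self.heap = {}  # symmetric heap: buffer name -> list of floats


def symmetric_alloc(pes, name, size):
    """Collectively allocate a buffer with the same name and size on every PE."""
    for pe in pes:
        pe.heap[name] = [0.0] * size


def toy_put(pes, dst_name, src_name, src_pe, dst_pe):
    """One-sided write: copy src_pe's local src buffer into dst_pe's dst buffer."""
    pes[dst_pe].heap[dst_name][:] = pes[src_pe].heap[src_name]


pes = [ToyPE(i) for i in range(2)]
symmetric_alloc(pes, "x", 4)
symmetric_alloc(pes, "y", 4)

pes[0].heap["y"][:] = [1.0] * 4             # PE 0 fills its local y
toy_put(pes, "x", "y", src_pe=0, dst_pe=1)  # PE 0 writes its y into x on PE 1

print(pes[1].heap["x"])  # x on PE 1 now holds PE 0's y
print(pes[0].heap["x"])  # x on PE 0 is untouched
```

The real library performs the same kind of transfer between GPU buffers on different processes, and the receiving PE takes no action to accept the data.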
Quick Start
To use NVSHMEM4Py in your Python application:
import nvshmem.core as nvshmem
from mpi4py import MPI
from cuda.core.experimental import Device
dev = Device()
dev.set_current()
stream = dev.create_stream()
# Initialize MPI
comm = MPI.COMM_WORLD
# Initialize NVSHMEM with MPI
nvshmem.init(dev, mpi_comm=comm, init_method="mpi")
# Get information about the current PE
my_pe = nvshmem.my_pe()
n_pes = nvshmem.n_pes()
# Allocate symmetric memory
# array() returns a CuPy ndarray backed by symmetric memory
x = nvshmem.array((1024,), dtype="float32")
y = nvshmem.array((1024,), dtype="float32")
if my_pe == 0:
    y[:] = 1.0
# Perform communication operations
# Put y from PE 0 into x on PE 1
if my_pe == 0:
    nvshmem.put(x, y, pe=1, stream=stream)
# Wait for the work enqueued on the stream (the put on PE 0) to finish
stream.sync()
# Clean up
nvshmem.free_array(x)
nvshmem.free_array(y)
nvshmem.finalize()
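The example above hard-codes PE 1 as the target of the put, so it assumes the job is launched with at least two PEs. One way to avoid that assumption is to derive the target from the values returned by `my_pe()` and `n_pes()`. The sketch below shows a ring pattern; `ring_peer` is a hypothetical helper, not part of NVSHMEM4Py, and it runs without a GPU.

```python
def ring_peer(my_pe, n_pes):
    """Return the next PE in a ring, or None when there is no other PE."""
    if n_pes < 2:
        return None
    return (my_pe + 1) % n_pes


# Show the peer chosen by each PE for a few job sizes.
for n_pes in (1, 2, 4):
    print(n_pes, [ring_peer(pe, n_pes) for pe in range(n_pes)])
```

In the quick start, the returned value could replace the literal `1` passed as the target PE, with the put skipped when the helper returns `None`.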
Key Features
- Pythonic interface to NVSHMEM functionality
- Seamless integration with NumPy, CuPy, and PyTorch
- Support for symmetric memory allocation and management
- Communication operations (put/get, collectives)
- Synchronization primitives
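As a rough illustration of how a one-sided get differs from a collective, the following toy model uses plain Python lists as per-PE buffers. `toy_get` and `toy_allreduce_sum` are hypothetical names and do not correspond to NVSHMEM4Py functions. The point is only the data movement: a get reads one PE's buffer, while a collective involves every PE and leaves each with the same result.

```python
# Toy, CPU-only model; toy_get and toy_allreduce_sum are hypothetical names,
# not NVSHMEM4Py functions.

def toy_get(buffers, src_pe):
    """One-sided read: return a copy of the buffer owned by src_pe."""
    return list(buffers[src_pe])


def toy_allreduce_sum(buffers):
    """Collective: every PE ends up with the element-wise sum over all PEs."""
    total = [sum(column) for column in zip(*buffers)]
    return [list(total) for _ in buffers]


# One buffer per PE; PE i holds the value i + 1 in every slot.
buffers = [[float(pe + 1)] * 3 for pe in range(4)]

fetched = toy_get(buffers, src_pe=2)  # any PE can read PE 2's buffer
reduced = toy_allreduce_sum(buffers)  # every PE receives the same sums

print(fetched)
print(reduced[0])
```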
For more detailed information, see the NVSHMEM4Py Overview and API reference sections.