NVSHMEM
2.8.0
  • Introduction
    • Key Features
    • Communication Transports
    • Advantages Of NVSHMEM
    • GPU-Initiated Communication And Strong Scaling
  • Using NVSHMEM
    • Example NVSHMEM Program
    • Using the NVSHMEM InfiniBand GPUDirect Async Transport
    • Using NVSHMEM With MPI or OpenSHMEM
    • Compiling NVSHMEM Programs
    • Running NVSHMEM Programs
    • Communication Model
    • Data Consistency
    • Multiprocess GPU Support
    • Building NVSHMEM Applications/Libraries
  • NVSHMEM and the CUDA Model
    • The CUDA Execution Model
      • Work Submission in CUDA
      • The CUDA Abstract Machine
    • Nonlocal Operations and the CUDA Execution Model
      • CUDA Streams and Circular Dependencies
      • CUDA Stream Order and Execution Resources
      • CUDA Streams and False Circular Dependencies
      • Intra-Kernel Synchronization
      • Ensuring Safe Nonlocal Operations Using NVSHMEM Cooperative Kernel Launch
    • Implicitly Asynchronous cudaMemcpy
  • Memory Model
    • Pointers to Symmetric Objects
    • Ordering of Operations
    • Atomicity Guarantees
    • Differences Between NVSHMEM and OpenSHMEM
      • Ordering of Blocking Fetching Operations
      • Visibility Guarantees
  • Execution Model
    • Progress of NVSHMEM Operations
    • Invoking NVSHMEM Operations
  • Library Constants
  • Library Handles
  • Environment Variables
    • Standard options
    • Bootstrap options
    • Additional options
    • Collectives options
    • Transport options
    • NVTX options
  • NVSHMEM APIs
    • Overview of the APIs
      • Unsupported OpenSHMEM 1.3 APIs
      • OpenSHMEM 1.3 APIs Not Supported Over InfiniBand
      • Supported OpenSHMEM APIs (OpenSHMEM 1.4 and 1.5)
      • NVSHMEM API Extensions For CPU Threads
      • NVSHMEM API Extensions For GPU Threads
    • Library Setup, Exit, and Query
      • NVSHMEM_INIT
      • NVSHMEMX_INIT_ATTR
      • NVSHMEMX_CUMODULE_INIT
      • NVSHMEMX_INIT_STATUS
      • NVSHMEM_MY_PE
      • NVSHMEM_N_PES
      • NVSHMEM_FINALIZE
      • NVSHMEM_GLOBAL_EXIT
      • NVSHMEM_PTR
      • NVSHMEM_INFO_GET_VERSION
      • NVSHMEM_INFO_GET_NAME
    • Thread Support
      • NVSHMEM_INIT_THREAD
      • NVSHMEM_QUERY_THREAD
    • Kernel Launch Routines
      • NVSHMEMX_COLLECTIVE_LAUNCH
      • NVSHMEMX_COLLECTIVE_LAUNCH_QUERY_GRIDSIZE
    • Memory Management
      • NVSHMEM_MALLOC, NVSHMEM_FREE, NVSHMEM_ALIGN
      • NVSHMEM_CALLOC
      • Memory Registration
        • NVSHMEMX_BUFFER_REGISTER
        • NVSHMEMX_BUFFER_UNREGISTER
        • NVSHMEMX_BUFFER_UNREGISTER_ALL
    • Team Management
      • Predefined and Application-Defined Teams
      • Team Handles
      • Thread Safety
      • Collective Ordering
      • Team Creation
      • NVSHMEM_TEAM_MY_PE
      • NVSHMEM_TEAM_N_PES
      • NVSHMEM_TEAM_CONFIG_T
      • NVSHMEM_TEAM_GET_CONFIG
      • NVSHMEM_TEAM_TRANSLATE_PE
      • NVSHMEM_TEAM_SPLIT_STRIDED
      • NVSHMEM_TEAM_SPLIT_2D
      • NVSHMEM_TEAM_DESTROY
    • Remote Memory Access
      • Blocking RMA
        • NVSHMEM_PUT
        • NVSHMEM_P
        • NVSHMEM_IPUT
        • NVSHMEM_GET
        • NVSHMEM_G
        • NVSHMEM_IGET
      • Nonblocking RMA
        • NVSHMEM_PUT_NBI
        • NVSHMEM_GET_NBI
    • Atomic Memory Operations
      • NVSHMEM_ATOMIC_FETCH
      • NVSHMEM_ATOMIC_SET
      • NVSHMEM_ATOMIC_COMPARE_SWAP
      • NVSHMEM_ATOMIC_SWAP
      • NVSHMEM_ATOMIC_FETCH_INC
      • NVSHMEM_ATOMIC_INC
      • NVSHMEM_ATOMIC_FETCH_ADD
      • NVSHMEM_ATOMIC_ADD
      • NVSHMEM_ATOMIC_FETCH_AND
      • NVSHMEM_ATOMIC_AND
      • NVSHMEM_ATOMIC_FETCH_OR
      • NVSHMEM_ATOMIC_OR
      • NVSHMEM_ATOMIC_FETCH_XOR
      • NVSHMEM_ATOMIC_XOR
    • Signaling Operations
      • Atomicity Guarantees for Signaling Operations
      • Available Signal Operators
      • NVSHMEM_PUT_SIGNAL
      • NVSHMEM_PUT_SIGNAL_NBI
      • NVSHMEM_SIGNAL_FETCH
      • NVSHMEMX_SIGNAL
      • NVSHMEMX_SIGNAL_OP
    • Collective Communication
      • Team-based collectives
      • Implicit team collectives
      • Error codes returned from team-based collectives
      • NVSHMEM_BARRIER_ALL
      • NVSHMEM_BARRIER
      • NVSHMEM_SYNC
      • NVSHMEM_SYNC_ALL
      • NVSHMEM_ALLTOALL
      • NVSHMEM_BROADCAST
      • NVSHMEM_FCOLLECT
      • NVSHMEM_REDUCTIONS
        • AND
        • OR
        • XOR
        • MAX
        • MIN
        • SUM
        • PROD
    • Point-To-Point Synchronization
      • NVSHMEM_WAIT_UNTIL
      • NVSHMEM_WAIT_UNTIL_ALL
      • NVSHMEM_WAIT_UNTIL_ANY
      • NVSHMEM_WAIT_UNTIL_SOME
      • NVSHMEM_WAIT_UNTIL_ALL_VECTOR
      • NVSHMEM_WAIT_UNTIL_ANY_VECTOR
      • NVSHMEM_WAIT_UNTIL_SOME_VECTOR
      • NVSHMEM_TEST
      • NVSHMEM_TEST_ALL
      • NVSHMEM_TEST_ANY
      • NVSHMEM_TEST_SOME
      • NVSHMEM_TEST_ALL_VECTOR
      • NVSHMEM_TEST_ANY_VECTOR
      • NVSHMEM_TEST_SOME_VECTOR
      • NVSHMEM_SIGNAL_WAIT_UNTIL
    • Memory Ordering
      • NVSHMEM_FENCE
      • NVSHMEM_QUIET
  • Examples
    • Attribute-Based Initialization Example
    • Collective Launch Example
    • On-Stream Example
    • Threadgroup Example
    • Put on Block Example
    • Ring Broadcast Example
  • Troubleshooting And FAQs
    • General FAQs
    • Prerequisite FAQs
    • Running NVSHMEM Programs FAQs
    • Interoperability With MPI FAQs
    • Interoperability With OpenSHMEM FAQs
    • GPU-GPU Interconnection FAQs
    • NVSHMEM API Usage FAQs
    • Debugging FAQs
    • Miscellaneous FAQs
  • NVSHMEM SLA
    • LICENSE AGREEMENT FOR NVIDIA SOFTWARE DEVELOPMENT KITS
      • 1. License.
      • 2. Limitations.
      • 3. Ownership.
      • 4. No Warranties.
      • 5. Limitations of Liability.
      • 6. Termination.
      • 7. General.
    • NVSHMEM SUPPLEMENT TO SOFTWARE LICENSE AGREEMENT FOR NVIDIA SOFTWARE DEVELOPMENT KITS
  • Acknowledgements
    • Notices
    • Trademarks
    • Copyright


© Copyright 2022, NVIDIA Corporation. All rights reserved.
