Library Setup, Exit, and Query
The library setup and query interfaces initialize and monitor the parallel environment of the PEs.
nvshmem_init allocates and initializes resources used by the NVSHMEM
library. It is a collective operation that all PEs must call before any
other NVSHMEM routine may be called. At the end of the NVSHMEM program
that it initialized, the call to nvshmem_init must be matched with a
call to nvshmem_finalize. After the first call to nvshmem_init, a
subsequent call to nvshmem_init or nvshmem_init_thread in the same
program results in undefined behavior.
nvshmemx_init_attr(unsigned int flags, nvshmemx_init_attr_t *attributes)
- flags [IN]
- Bitwise OR of operation flags. A value of 0 indicates an initialization that is similar to a call to nvshmem_init.
- attributes [IN]
- Additional attributes to be used when initializing the NVSHMEM library.
NVSHMEM provides the nvshmem_init function, as defined in the
OpenSHMEM 1.3 specification. In addition, the nvshmemx_init_attr
function is provided to support the easy porting of MPI and OpenSHMEM
programs to NVSHMEM. It allows the initialization of NVSHMEM based on an
existing MPI communicator or OpenSHMEM job. This is useful when an
application is written to use NVSHMEM in a node and MPI across nodes, or
when an application uses another OpenSHMEM implementation to manage
communication across a symmetric heap in the system memory.
The nvshmemx_init_attr function initializes the NVSHMEM library by
allocating the resources that are used by the library and assigning a
unique identifier to each PE. This collective operation should be called
by all PEs before any other NVSHMEM routine is called.
The flags argument can be set to 0 or one of the following values:
- NVSHMEMX_INIT_WITH_MPI_COMM - This flag is used to specify that an MPI communicator is provided by the user.
- NVSHMEMX_INIT_WITH_SHMEM - This flag is used to specify that NVSHMEM is used inside an OpenSHMEM job.
With the NVSHMEMX_INIT_WITH_MPI_COMM option, the NVSHMEM library is
initialized based on the MPI communicator that is provided, with each
rank in the MPI communicator participating as an NVSHMEM PE. A call to
nvshmem_finalize is required before the MPI communicator is destroyed.
Note: Do not make any calls to NVSHMEM routines after the MPI communicator has been destroyed.
With the NVSHMEMX_INIT_WITH_SHMEM option, the NVSHMEM library is initialized
based on the OpenSHMEM PE underlying each NVSHMEM PE. A call to
nvshmem_finalize is required before
shmem_finalize is called. Do not make
any calls to NVSHMEM routines after
shmem_finalize is called.
The attributes argument is a pointer to a structure type that contains the following fields:
- void *mpi_comm
- A pointer to the MPI communicator handle that will be used as the NVSHMEM world team.
Returns 0 on success or an error code on failure.
nvshmem_my_pe returns the PE number of the calling PE. It accepts no
arguments. The result is an integer between 0 and npes - 1, where
npes is the total number of PEs executing the current program.
Integer - Between 0 and npes - 1.
nvshmem_n_pes returns the number of PEs running in the program.
Integer - Number of PEs running in the NVSHMEM program.
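The PE number and the PE count are commonly combined to pick communication partners. The following is a minimal sketch in plain C, written so that it compiles without NVSHMEM installed; the helper names `ring_right` and `ring_left` are hypothetical and not part of NVSHMEM. In a real program, the `mype` and `npes` arguments would presumably come from the two query routines described above.

```c
/* Hypothetical helpers (not part of NVSHMEM): compute ring neighbors
 * from a PE number in the range 0..npes-1 and the total PE count. */

/* PE number of the next PE in the ring, wrapping from npes-1 to 0. */
int ring_right(int mype, int npes)
{
    return (mype + 1) % npes;
}

/* PE number of the previous PE in the ring, wrapping from 0 to npes-1.
 * Adding npes before the modulo keeps the left operand non-negative. */
int ring_left(int mype, int npes)
{
    return (mype + npes - 1) % npes;
}
```

Because PE numbers start at 0 and end at npes - 1, the modulo arithmetic above covers every PE exactly once, including the single-PE case where each PE is its own neighbor.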
nvshmem_finalize is a collective operation that ends the NVSHMEM
portion of a program previously initialized by nvshmem_init or
nvshmem_init_thread and releases all resources used by the NVSHMEM
library. This collective operation requires all PEs to participate in
the call. There is an implicit global barrier in nvshmem_finalize to
ensure that pending communications are completed and that no resources
are released until all PEs have entered nvshmem_finalize.
nvshmem_finalize must be the last NVSHMEM library call encountered
in the NVSHMEM portion of a program. A call to nvshmem_finalize will
release all resources initialized by a corresponding call to
nvshmem_init or nvshmem_init_thread. All processes that
represent the PEs will still exist after the call to
nvshmem_finalize returns, but they will no longer have access to
resources that have been released.
nvshmem_finalize releases all resources used by the NVSHMEM library,
including the symmetric memory heap and pointers initiated by
nvshmem_ptr. This collective operation requires all PEs to
participate in the call, not just a subset of the PEs. The non-NVSHMEM
portion of a program may continue after a call to nvshmem_finalize
by all PEs.
nvshmem_ptr(const void *dest, int pe)
__device__ void *
nvshmem_ptr(const void *dest, int pe)
- dest [IN]
- The symmetric address of the remotely accessible data object to be referenced.
- pe [IN]
- An integer that indicates the PE number on which dest is to be accessed.
nvshmem_ptr returns an address that may be used to directly reference
dest on the specified PE. This address can be assigned to a pointer.
After that, ordinary loads and stores to dest may be performed. The
address returned by nvshmem_ptr is a local address to a remotely
accessible data object. Providing this address to an argument of an
NVSHMEM routine that requires a symmetric address results in undefined
behavior.
The nvshmem_ptr routine can provide an efficient means to accomplish
communication, for example when a sequence of reads and writes to a data
object on a remote PE does not match the access pattern provided by an
NVSHMEM data transfer routine.
A local pointer to the remotely accessible dest data object is
returned when it can be accessed using memory loads and stores.
Otherwise, a null pointer is returned.
When calling nvshmem_ptr, dest is the address of the referenced
symmetric data object on the calling PE.
nvshmem_info_get_version(int *major, int *minor)
nvshmem_info_get_version(int *major, int *minor)
- major [OUT]
- The major version of the OpenSHMEM Specification in use.
- minor [OUT]
- The minor version of the OpenSHMEM Specification in use.
This routine returns the major and minor version of the OpenSHMEM
Specification in use. For a given library implementation, the major and
minor version returned by this call are consistent with the library
constants NVSHMEM_MAJOR_VERSION and NVSHMEM_MINOR_VERSION.
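A program that depends on features from a particular specification version can compare the returned pair against a minimum. The following is a minimal sketch in plain C; `version_at_least` is a hypothetical helper, not an NVSHMEM routine, and in a real program the `major` and `minor` arguments would presumably be the values written by nvshmem_info_get_version.

```c
#include <stdbool.h>

/* Hypothetical helper (not part of NVSHMEM): returns true when the
 * version (major, minor) is at least (req_major, req_minor).
 * The major number is compared first; the minor number only matters
 * when the major numbers are equal. */
bool version_at_least(int major, int minor, int req_major, int req_minor)
{
    if (major != req_major)
        return major > req_major;
    return minor >= req_minor;
}
```

Comparing the major number first avoids the common mistake of rejecting a newer major version because its minor number is smaller than the required one.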
- name [OUT]
- The vendor defined string.
This routine returns the vendor defined name string of size defined by
the library constant NVSHMEM_MAX_NAME_LEN. The program calling this
function provides the name memory buffer of at least size
NVSHMEM_MAX_NAME_LEN. The implementation copies the vendor defined
string of size at most NVSHMEM_MAX_NAME_LEN to name. In C/C++,
the string is terminated by a null character. If the name memory
buffer is provided with size less than NVSHMEM_MAX_NAME_LEN,
behavior is undefined. For a given library implementation, the vendor
string returned is consistent with the library constant
NVSHMEM_VENDOR_STRING.
This section specifies the interaction between the NVSHMEM interfaces and user threads. It also describes the routines that can be used for initializing and querying the thread environment. There are four levels of threading defined by the NVSHMEM specification.
- NVSHMEM_THREAD_SINGLE
- The NVSHMEM program must not be multithreaded.
- NVSHMEM_THREAD_FUNNELED
- The NVSHMEM program may be multithreaded. However, the program must
ensure that only the main thread invokes the NVSHMEM interfaces. The
main thread is the thread that invokes either nvshmem_init or nvshmem_init_thread.
- NVSHMEM_THREAD_SERIALIZED
- The NVSHMEM program may be multithreaded. However, the program must ensure that the NVSHMEM interfaces are not invoked concurrently by multiple threads.
- NVSHMEM_THREAD_MULTIPLE
- The NVSHMEM program may be multithreaded and any thread may invoke the NVSHMEM interfaces.
The thread level constants must have increasing integer values; i.e.,
NVSHMEM_THREAD_SINGLE < NVSHMEM_THREAD_FUNNELED <
NVSHMEM_THREAD_SERIALIZED < NVSHMEM_THREAD_MULTIPLE. The following
semantics apply to the usage of these models:
- In the NVSHMEM_THREAD_FUNNELED, NVSHMEM_THREAD_SERIALIZED, and NVSHMEM_THREAD_MULTIPLE thread levels, the nvshmem_init_thread and nvshmem_finalize calls must be invoked by the same thread.
- Any NVSHMEM operation initiated by a thread is considered an action
of the PE as a whole. The symmetric heap and symmetric variables
scope are not impacted by multiple threads invoking the NVSHMEM
interfaces. Each PE has a symmetric heap that is shared by all
threads within that PE. For example, a thread invoking a memory
allocation routine such as nvshmem_malloc allocates memory that is accessible by all threads of the PE. The requirement that the same symmetric heap operations must be executed by all PEs in the same order also applies in a threaded environment. Similarly, the completion of collective operations is not impacted by multiple threads. For example, nvshmem_barrier_all is completed when all PEs enter and exit the nvshmem_barrier_all call, even though only one thread in the PE is participating in the collective call.
- Blocking NVSHMEM calls will only block the calling thread, allowing other threads, if available, to continue executing. The calling thread will be blocked until the event on which it is waiting occurs. Once the blocking call is completed, the thread is ready to continue execution. A blocked thread will not prevent progress of other threads on the same PE and will not prevent them from executing other NVSHMEM calls when the thread level permits. In addition, a blocked thread will not prevent the progress of NVSHMEM calls performed on other PEs.
- In the NVSHMEM_THREAD_MULTIPLE thread level, all NVSHMEM calls are thread-safe. That is, any two concurrently running threads may make NVSHMEM calls.
- In the NVSHMEM_THREAD_SERIALIZED and NVSHMEM_THREAD_MULTIPLE thread levels, if multiple threads call collective routines, including the symmetric heap management routines, it is the programmer’s responsibility to ensure the correct ordering of collective calls. Note that NVSHMEM does not currently support multiple threads performing concurrent collective operations, including the NVSHMEM collective memory management operations.
nvshmem_init_thread(int requested, int *provided)
- requested [IN]
- The thread level support requested by the user.
- provided [OUT]
- The thread level support provided by the NVSHMEM implementation.
nvshmem_init_thread initializes the NVSHMEM library in the same way
as nvshmem_init. In addition, nvshmem_init_thread also performs
the initialization required for supporting the provided thread level.
The argument requested is used to specify the desired level of
thread support. The argument provided returns the support level
provided by the library. The allowed values for provided and
requested are NVSHMEM_THREAD_SINGLE, NVSHMEM_THREAD_FUNNELED,
NVSHMEM_THREAD_SERIALIZED, and NVSHMEM_THREAD_MULTIPLE.
An NVSHMEM program is initialized either by nvshmem_init or
nvshmem_init_thread. Once an NVSHMEM library initialization call has
been performed, a subsequent initialization call in the same program
results in undefined behavior. If the call to nvshmem_init_thread is
unsuccessful in allocating and initializing resources for the NVSHMEM
library, then the behavior of any subsequent call to the NVSHMEM library
is undefined.
nvshmem_init_thread returns 0 upon success; otherwise, it returns a
non-zero value.
The NVSHMEM library can be initialized either by nvshmem_init or
nvshmem_init_thread. If the NVSHMEM library is initialized by
nvshmem_init, the library implementation can choose to support any
one of the defined thread levels.
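Because the provided level may differ from the requested one, a program would normally check the result before starting additional threads. The following is a minimal sketch in plain C that relies only on the rule that the thread level constants have increasing integer values. The `EXAMPLE_THREAD_*` values and the `thread_level_sufficient` helper are hypothetical stand-ins so that the sketch compiles without NVSHMEM installed; a real program would use the thread level constants from the NVSHMEM headers and the value written to provided.

```c
#include <stdbool.h>

/* Stand-in values for illustration only (an assumption, not the
 * values defined by NVSHMEM). Only their increasing order matters. */
enum example_thread_level {
    EXAMPLE_THREAD_SINGLE = 0,
    EXAMPLE_THREAD_FUNNELED = 1,
    EXAMPLE_THREAD_SERIALIZED = 2,
    EXAMPLE_THREAD_MULTIPLE = 3
};

/* Hypothetical helper: true when the level granted by the library is
 * at least as permissive as the level the program asked for. This is
 * valid because the thread level constants are strictly increasing. */
bool thread_level_sufficient(int requested, int provided)
{
    return provided >= requested;
}
```

A program that receives an insufficient level could, for example, fall back to funneling all NVSHMEM calls through one thread or exit with an error.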
- provided [OUT]
- The thread level support provided by the NVSHMEM implementation.
The nvshmem_query_thread call returns the level of thread support
currently being provided. The value returned will be the same as was
provided by a call to nvshmem_init_thread, if the NVSHMEM library
was initialized by nvshmem_init_thread. If the library was
initialized by nvshmem_init, the implementation can choose to provide
any one of the defined thread levels, and nvshmem_query_thread
returns this thread level.