Library Setup, Exit, and Query¶
The library setup and query interfaces that initialize and monitor the parallel environment of the PEs.
NVSHMEM_INIT¶
-
void
nvshmem_init
(void)¶
Description
nvshmem_init
allocates and initializes resources used by the NVSHMEM
library. It is a collective operation that all PEs must call before any
other NVSHMEM routine may be called. At the end of the NVSHMEM program
which it initialized, the call to nvshmem_init
must be matched with
a call to nvshmem_finalize
. After the first call to
nvshmem_init
, a subsequent call to nvshmem_init
or
nvshmem_init_thread
in the same program results in undefined
behavior.
Returns
None.
NVSHMEMX_INIT_ATTR¶
-
int
nvshmemx_init_attr
(unsigned int flags, nvshmemx_init_attr_t *attributes)¶
- flags [IN]
- Bitwise OR of operation flags. A value of 0 indicates an initialization that is similar to when
nvshmem_init
is used. - attributes [IN]
- Additional attributes to be used when initializing the NVSHMEM library.
Description
NVSHMEM provides the nvshmem_init
function, as defined in the OpenSHMEM 1.3
specification. In addition, the nvshmemx_init_attr
function is provided to
support the easy porting of MPI and OpenSHMEM programs to NVSHMEM. It allows
the initialization of NVSHMEM based on an existing MPI communicator or OpenSHMEM
job. This is useful when an application is written to use NVSHMEM in a node
and MPI across nodes, or when an application uses another OpenSHMEM
implementation to manage communication across a symmetric heap in the system
memory.
The nvshmemx_init_attr
function initializes the NVSHMEM library by allocating
the resources that are used by the library and assigning a unique identifier to each PE.
This collective operation should be called by all PEs before any other NVSHMEM
routine.
The flags argument can be set to 0 or one of the following values:
- NVSHMEMX_INIT_WITH_MPI_COMM - This flag is used to specify that an MPI communicator is provided by the user.
- NVSHMEMX_INIT_WITH_SHMEM - This flag is used to specify that NVSHMEM is used inside an OpenSHMEM job.
With the NVSHMEMX_INIT_WITH_MPI_COMM
option, the NVSHMEM library is initialized
based on the MPI communicator that is provided with each rank in the MPI
communicator that participates as an NVSHMEM PE. A call to nvshmem_finalize
is
required before the MPI communicator is destroyed.
Note: Do not make any calls to NVSHMEM routines after the MPI communicator has been destroyed.
With the NVSHMEMX_INIT_WITH_SHMEM
option, the NVSHMEM library is initialized
based on the OpenSHMEM PE underlying each NVSHMEM PE. A call to
nvshmem_finalize
is required before shmem_finalize
is called. Do not make
any calls to NVSHMEM routines after nvshmem_finalize
is called.
The attributes argument is a pointer to a structure type that contains the following fields:
- void *mpi_comm
- A pointer to the MPI communicator handle that will be used as the NVSHMEM world team.
Returns
Returns 0 on success or an error code on failure.
Example
NVSHMEM_MY_PE¶
-
int
nvshmem_my_pe
(void)¶
-
__device__ int
nvshmem_my_pe
(void)
Description
This routine returns the PE number of the calling PE. It accepts no
arguments. The result is an integer between 0
and npes
- 1
,
where npes
is the total number of PEs executing the current program.
Returns
Integer - Between 0
and npes
- 1
NVSHMEM_N_PES¶
-
int
nvshmem_n_pes
(void)¶
-
__device__ int
nvshmem_n_pes
(void)
Description
The routine returns the number of PEs running in the program.
Returns
Integer - Number of PEs running in the NVSHMEM program.
NVSHMEM_FINALIZE¶
-
void
nvshmem_finalize
(void)¶
Description
nvshmem_finalize
is a collective operation that ends the NVSHMEM
portion of a program previously initialized by nvshmem_init
or
nvshmem_init_thread
and releases all resources used by the NVSHMEM
library. This collective operation requires all PEs to participate in
the call. There is an implicit global barrier in nvshmem_finalize
to
ensure that pending communications are completed and that no resources
are released until all PEs have entered nvshmem_finalize
.
nvshmem_finalize
must be the last NVSHMEM library call encountered
in the NVSHMEM portion of a program. A call to nvshmem_finalize
will
release all resources initialized by a corresponding call to
nvshmem_init
or nvshmem_init_thread
. All processes that
represent the PEs will still exist after the call to
nvshmem_finalize
returns, but they will no longer have access to
resources that have been released.
Returns
None.
Notes
nvshmem_finalize
releases all resources used by the NVSHMEM library
including the symmetric memory heap and pointers initiated by
nvshmem_ptr
. This collective operation requires all PEs to
participate in the call, not just a subset of the PEs. The non-NVSHMEM
portion of a program may continue after a call to nvshmem_finalize
by all PEs.
NVSHMEM_PTR¶
-
void *
nvshmem_ptr
(const void *dest, int pe)¶
-
__device__ void *
nvshmem_ptr
(const void *dest, int pe)
- dest [IN]
- The symmetric address of the remotely accessible data object to be referenced.
- pe [IN]
- An integer that indicates the PE number on which
dest
is to be accessed.
Description
nvshmem_ptr
returns an address that may be used to directly
reference dest
on the specified PE. This address can be assigned to
a pointer. After that, ordinary loads and stores to dest
may be
performed. The address returned by nvshmem_ptr
is a local address to
a remotely accessible data object. Providing this address to an argument
of an NVSHMEM routine that requires a symmetric address results in
undefined behavior.
The nvshmem_ptr
routine can provide an efficient means to accomplish
communication, for example when a sequence of reads and writes to a data
object on a remote PE does not match the access pattern provided in an
NVSHMEM data transfer routine like nvshmem_put
or nvshmem_iget
.
Returns
A local pointer to the remotely accessible dest
data object is
returned when it can be accessed using memory loads and stores.
Otherwise, a null pointer is returned.
Notes
When calling nvshmem_ptr
, dest
is the address of the referenced
symmetric data object on the calling PE.
NVSHMEM_INFO_GET_VERSION¶
-
void
nvshmem_info_get_version
(int *major, int *minor)¶
-
__device__ void
nvshmem_info_get_version
(int *major, int *minor)
- major [OUT]
- The major version of the OpenSHMEM Specification in use.
- minor [OUT]
- The minor version of the OpenSHMEM Specification in use.
Description
This routine returns the major and minor version of the OpenSHMEM
Specification in use. For a given library implementation, the major and
minor version returned by these calls are consistent with the library
constants NVSHMEM_MAJOR_VERSION
and NVSHMEM_MINOR_VERSION
.
Returns
None.
NVSHMEM_INFO_GET_NAME¶
-
void
nvshmem_info_get_name
(char *name)¶
-
__device__ void
nvshmem_info_get_name
(char *name)
- name [OUT]
- The vendor defined string.
Description
This routine returns the vendor defined name string of size defined by
the library constant NVSHMEM_MAX_NAME_LEN
. The program calling this
function provides the name
memory buffer of at least size
NVSHMEM_MAX_NAME_LEN
. The implementation copies the vendor defined
string of size at most NVSHMEM_MAX_NAME_LEN
to name
. In C/C++,
the string is terminated by a null character. If the name
memory
buffer is provided with size less than NVSHMEM_MAX_NAME_LEN
,
behavior is undefined. For a given library implementation, the vendor
string returned is consistent with the library constant
NVSHMEM_VENDOR_STRING
.
Returns
None.
Thread Support¶
This section specifies the interaction between the NVSHMEM interfaces and user threads. It also describes the routines that can be used for initializing and querying the thread environment. There are four levels of threading defined by the OpenSHMEM specification.
NVSHMEM_THREAD_SINGLE
- The NVSHMEM program must not be multithreaded.
NVSHMEM_THREAD_FUNNELED
- The NVSHMEM program may be multithreaded. However, the program must
ensure that only the main thread invokes the NVSHMEM interfaces. The
main thread is the thread that invokes either
nvshmem_init
ornvshmem_init_thread
. NVSHMEM_THREAD_SERIALIZED
- The NVSHMEM program may be multithreaded. However, the program must ensure that the NVSHMEM interfaces are not invoked concurrently by multiple threads.
NVSHMEM_THREAD_MULTIPLE
- The NVSHMEM program may be multithreaded and any thread may invoke the NVSHMEM interfaces.
The thread level constants must have increasing integer values; i.e.,
NVSHMEM_THREAD_SINGLE
< NVSHMEM_THREAD_FUNNELED
<
NVSHMEM_THREAD_SERIALIZED
< NVSHMEM_THREAD_MULTIPLE
. The
following semantics apply to the usage of these models:
In the
NVSHMEM_THREAD_FUNNELED
,NVSHMEM_THREAD_SERIALIZED
, andNVSHMEM_THREAD_MULTIPLE
thread levels, thenvshmem_init
andnvshmem_finalize
calls must be invoked by the same thread.Any NVSHMEM operation initiated by a thread is considered an action of the PE as a whole. The symmetric heap and symmetric variables scope are not impacted by multiple threads invoking the NVSHMEM interfaces. Each PE has a symmetric heap that is shared by all threads within that PE. For example, a thread invoking a memory allocation routine such as
nvshmem_malloc
allocates memory that is accessible by all threads of the PE. The requirement that the same symmetric heap operations must be executed by all PEs in the same order also applies in a threaded environment. Similarly, the completion of collective operations is not impacted by multiple threads. For example,nvshmem_barrier_all
is completed when all PEs enter and exit thenvshmem_barrier_all
call, even though only one thread in the PE is participating in the collective call.Blocking NVSHMEM calls will only block the calling thread, allowing other threads, if available, to continue executing. The calling thread will be blocked until the event on which it is waiting occurs. Once the blocking call is completed, the thread is ready to continue execution. A blocked thread will not prevent progress of other threads on the same PE and will not prevent them from executing other NVSHMEM calls when the thread level permits. In addition, a blocked thread will not prevent the progress of NVSHMEM calls performed on other PEs.
In the
NVSHMEM_THREAD_MULTIPLE
thread level, all NVSHMEM calls are thread-safe. That is, any two concurrently running threads may make NVSHMEM calls.In the
NVSHMEM_THREAD_SERIALIZED
andNVSHMEM_THREAD_MULTIPLE
thread levels, if multiple threads call collective routines, including the symmetric heap management routines, it is the programmer’s responsibility to ensure the correct ordering of collective calls.NVSHMEM currently supports the
NVSHMEM_THREAD_SERIALIZED
threading level for host and stream APIs andNVSHMEM_THREAD_MULTIPLE
for device APIs.
NVSHMEM_INIT_THREAD¶
-
int
nvshmem_init_thread
(int requested, int *provided)¶
- requested [IN]
- The thread level support requested by the user for host and stream APIs.
- provided [OUT]
- The thread level support provided by the NVSHMEM implementation for host and stream APIs.
Description
nvshmem_init_thread
initializes the NVSHMEM library in the same way
as nvshmem_init
. In addition, nvshmem_init_thread
also performs
the initialization required for supporting the provided thread level.
The argument requested
is used to specify the desired level of
thread support. The argument provided
returns the support level
provided by the library. The allowed values for provided
and
requested
are NVSHMEM_THREAD_SINGLE
,
NVSHMEM_THREAD_FUNNELED
, NVSHMEM_THREAD_SERIALIZED
, and
NVSHMEM_THREAD_MULTIPLE
.
An NVSHMEM program is initialized either by nvshmem_init
or
nvshmem_init_thread
. Once an NVSHMEM library initialization call has
been performed, a subsequent initialization call in the same program
results in undefined behavior. If the call to nvshmem_init_thread
is
unsuccessful in allocating and initializing resources for the NVSHMEM
library, then the behavior of any subsequent call to the NVSHMEM library
is undefined.
Returns
nvshmem_init_thread
returns 0 upon success; otherwise, it returns a
nonzero value.
Notes
The NVSHMEM library can be initialized either by nvshmem_init
or
nvshmem_init_thread
. If the NVSHMEM library is initialized by
nvshmem_init
, the library implementation can choose to support any
one of the defined thread levels.
The NVSHMEM device APIs always support NVSHMEM_THREAD_MULTIPLE
.
NVSHMEM_QUERY_THREAD¶
-
void
nvshmem_query_thread
(int *provided)¶
- provided [OUT]
- The thread level support provided by the NVSHMEM implementation for host and stream APIs.
Description
The nvshmem_query_thread
call returns the level of thread support
currently being provided. The value returned will be same as was
returned in provided
by a call to nvshmem_init_thread
, if the
NVSHMEM library was initialized by nvshmem_init_thread
. If the
library was initialized by nvshmem_init
, the implementation can
choose to provide any one of the defined thread levels, and
nvshmem_query_thread
returns this thread level.
Returns
None.
Notes
The NVSHMEM device APIs always support NVSHMEM_THREAD_MULTIPLE
.