Library Setup, Exit, and Query
------------------------------

The library setup and query interfaces initialize and monitor the parallel
environment of the PEs.

.. _subsec:shmem_init:

**NVSHMEM_INIT**
~~~~~~~~~~~~~~~~

.. c:function:: void nvshmem\_init(void)

**Description**

``nvshmem_init`` allocates and initializes resources used by the NVSHMEM
library. It is a collective operation that all PEs must call before any other
NVSHMEM routine may be called. At the end of the NVSHMEM program which it
initialized, the call to ``nvshmem_init`` must be matched with a call to
``nvshmem_finalize``. After the first call to ``nvshmem_init``, a subsequent
call to ``nvshmem_init`` or ``nvshmem_init_thread`` in the same program results
in undefined behavior.

**Returns**

None.

.. include:: ../../api/init-attr.in

.. _subsec:shmem_my_pe:

**NVSHMEM_MY_PE**
~~~~~~~~~~~~~~~~~

.. c:function:: int nvshmem\_my\_pe(void)

.. c:function:: __device__ int nvshmem\_my\_pe(void)

**Description**

This routine returns the PE number of the calling PE. It accepts no arguments.
The result is an integer between ``0`` and ``npes`` - ``1``, where ``npes`` is
the total number of PEs executing the current program.

**Returns**

Integer - Between ``0`` and ``npes`` - ``1``

.. _subsec:shmem_n_pes:

**NVSHMEM_N_PES**
~~~~~~~~~~~~~~~~~

.. c:function:: int nvshmem\_n\_pes(void)

.. c:function:: __device__ int nvshmem\_n\_pes(void)

**Description**

This routine returns the number of PEs running in the program.

**Returns**

Integer - Number of PEs running in the NVSHMEM program.

.. _subsec:shmem_finalize:

**NVSHMEM_FINALIZE**
~~~~~~~~~~~~~~~~~~~~

.. c:function:: void nvshmem\_finalize(void)

**Description**

``nvshmem_finalize`` is a collective operation that ends the NVSHMEM portion of
a program previously initialized by ``nvshmem_init`` or ``nvshmem_init_thread``
and releases all resources used by the NVSHMEM library. This collective
operation requires all PEs to participate in the call.
There is an implicit global barrier in ``nvshmem_finalize`` to ensure that
pending communications are completed and that no resources are released until
all PEs have entered ``nvshmem_finalize``. ``nvshmem_finalize`` must be the last
NVSHMEM library call encountered in the NVSHMEM portion of a program. A call to
``nvshmem_finalize`` will release all resources initialized by a corresponding
call to ``nvshmem_init`` or ``nvshmem_init_thread``. All processes that
represent the PEs will still exist after the call to ``nvshmem_finalize``
returns, but they will no longer have access to resources that have been
released.

**Returns**

None.

**Notes**

``nvshmem_finalize`` releases all resources used by the NVSHMEM library
including the symmetric memory heap and pointers initiated by ``nvshmem_ptr``.
This collective operation requires all PEs to participate in the call, not just
a subset of the PEs. The non-NVSHMEM portion of a program may continue after a
call to ``nvshmem_finalize`` by all PEs.

.. _subsec:shmem_ptr:

**NVSHMEM_PTR**
~~~~~~~~~~~~~~~

.. c:function:: void *nvshmem\_ptr(const void *dest, int pe)

.. c:function:: __device__ void *nvshmem\_ptr(const void *dest, int pe)

*dest [IN]*
    The symmetric address of the remotely accessible data object to be
    referenced.

*pe [IN]*
    An integer that indicates the PE number on which ``dest`` is to be
    accessed.

**Description**

``nvshmem_ptr`` returns an address that may be used to directly reference
``dest`` on the specified PE. This address can be assigned to a pointer. After
that, ordinary loads and stores to ``dest`` may be performed. The address
returned by ``nvshmem_ptr`` is a local address to a remotely accessible data
object. Providing this address to an argument of an NVSHMEM routine that
requires a symmetric address results in undefined behavior.
The ``nvshmem_ptr`` routine can provide an efficient means to accomplish
communication, for example when a sequence of reads and writes to a data object
on a remote PE does not match the access pattern provided in an NVSHMEM data
transfer routine like ``nvshmem_put`` or ``nvshmem_iget``.

**Returns**

A local pointer to the remotely accessible ``dest`` data object is returned
when it can be accessed using memory loads and stores. Otherwise, a null
pointer is returned.

**Notes**

When calling ``nvshmem_ptr``, ``dest`` is the address of the referenced
symmetric data object on the calling PE.

.. _subsec:shmem_info_get_version:

**NVSHMEM_INFO_GET_VERSION**
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. c:function:: void nvshmem\_info\_get\_version(int *major, int *minor)

.. c:function:: __device__ void nvshmem\_info\_get\_version(int *major, int *minor)

*major [OUT]*
    The major version of the OpenSHMEM Specification in use.

*minor [OUT]*
    The minor version of the OpenSHMEM Specification in use.

**Description**

This routine returns the major and minor version of the OpenSHMEM Specification
in use. For a given library implementation, the major and minor versions
returned by this call are consistent with the library constants
``NVSHMEM_MAJOR_VERSION`` and ``NVSHMEM_MINOR_VERSION``.

**Returns**

None.

.. _subsec:shmem_info_get_name:

**NVSHMEM_INFO_GET_NAME**
~~~~~~~~~~~~~~~~~~~~~~~~~

.. c:function:: void nvshmem\_info\_get\_name(char *name)

.. c:function:: __device__ void nvshmem\_info\_get\_name(char *name)

*name [OUT]*
    The vendor-defined string.

**Description**

This routine returns the vendor-defined name string, whose maximum size is
given by the library constant ``NVSHMEM_MAX_NAME_LEN``. The program calling
this function provides the ``name`` memory buffer of at least size
``NVSHMEM_MAX_NAME_LEN``. The implementation copies the vendor-defined string
of size at most ``NVSHMEM_MAX_NAME_LEN`` to ``name``. In *C/C++*, the string is
terminated by a null character.
If the ``name`` memory buffer is provided with size less than
``NVSHMEM_MAX_NAME_LEN``, behavior is undefined. For a given library
implementation, the vendor string returned is consistent with the library
constant ``NVSHMEM_VENDOR_STRING``.

**Returns**

None.

.. _subsec:thread_support:

Thread Support
--------------

This section specifies the interaction between the NVSHMEM interfaces and user
threads. It also describes the routines that can be used for initializing and
querying the thread environment. There are four levels of threading defined by
the NVSHMEM specification.

``NVSHMEM_THREAD_SINGLE``
    The NVSHMEM program must not be multithreaded.

``NVSHMEM_THREAD_FUNNELED``
    The NVSHMEM program may be multithreaded. However, the program must ensure
    that only the main thread invokes the NVSHMEM interfaces. The main thread
    is the thread that invokes either ``nvshmem_init`` or
    ``nvshmem_init_thread``.

``NVSHMEM_THREAD_SERIALIZED``
    The NVSHMEM program may be multithreaded. However, the program must ensure
    that the NVSHMEM interfaces are not invoked concurrently by multiple
    threads.

``NVSHMEM_THREAD_MULTIPLE``
    The NVSHMEM program may be multithreaded and any thread may invoke the
    NVSHMEM interfaces.

The thread level constants must have increasing integer values; i.e.,
``NVSHMEM_THREAD_SINGLE`` < ``NVSHMEM_THREAD_FUNNELED`` <
``NVSHMEM_THREAD_SERIALIZED`` < ``NVSHMEM_THREAD_MULTIPLE``. The following
semantics apply to the usage of these models:

#. In the ``NVSHMEM_THREAD_FUNNELED``, ``NVSHMEM_THREAD_SERIALIZED``, and
   ``NVSHMEM_THREAD_MULTIPLE`` thread levels, the ``nvshmem_init`` and
   ``nvshmem_finalize`` calls must be invoked by the same thread.

#. Any NVSHMEM operation initiated by a thread is considered an action of the
   PE as a whole. The symmetric heap and the scope of symmetric variables are
   not impacted by multiple threads invoking the NVSHMEM interfaces. Each PE
   has a symmetric heap that is shared by all threads within that PE. For
   example, a thread invoking a memory allocation routine such as
   ``nvshmem_malloc`` allocates memory that is accessible by all threads of the
   PE. The requirement that the same symmetric heap operations must be executed
   by all PEs in the same order also applies in a threaded environment.
   Similarly, the completion of collective operations is not impacted by
   multiple threads. For example, ``nvshmem_barrier_all`` is completed when all
   PEs enter and exit the ``nvshmem_barrier_all`` call, even though only one
   thread in the PE is participating in the collective call.

#. Blocking NVSHMEM calls will only block the calling thread, allowing other
   threads, if available, to continue executing. The calling thread will be
   blocked until the event on which it is waiting occurs. Once the blocking
   call is completed, the thread is ready to continue execution. A blocked
   thread will not prevent progress of other threads on the same PE and will
   not prevent them from executing other NVSHMEM calls when the thread level
   permits. In addition, a blocked thread will not prevent the progress of
   NVSHMEM calls performed on other PEs.

#. In the ``NVSHMEM_THREAD_MULTIPLE`` thread level, all NVSHMEM calls are
   thread-safe. That is, any two concurrently running threads may make NVSHMEM
   calls.

#. In the ``NVSHMEM_THREAD_SERIALIZED`` and ``NVSHMEM_THREAD_MULTIPLE`` thread
   levels, if multiple threads call collective routines, including the
   symmetric heap management routines, it is the programmer’s responsibility
   to ensure the correct ordering of collective calls. Note that NVSHMEM does
   not currently support multiple threads performing concurrent collective
   operations, including the NVSHMEM collective memory management operations.

.. _subsec:shmem_init_thread:

**NVSHMEM_INIT_THREAD**
~~~~~~~~~~~~~~~~~~~~~~~

.. c:function:: int nvshmem\_init\_thread(int requested, int *provided)

*requested [IN]*
    The thread level support requested by the user.

*provided [OUT]*
    The thread level support provided by the NVSHMEM implementation.

**Description**

``nvshmem_init_thread`` initializes the NVSHMEM library in the same way as
``nvshmem_init``. In addition, ``nvshmem_init_thread`` performs the
initialization required for supporting the provided thread level. The argument
``requested`` is used to specify the desired level of thread support. The
argument ``provided`` returns the support level provided by the library. The
allowed values for ``provided`` and ``requested`` are
``NVSHMEM_THREAD_SINGLE``, ``NVSHMEM_THREAD_FUNNELED``,
``NVSHMEM_THREAD_SERIALIZED``, and ``NVSHMEM_THREAD_MULTIPLE``.

An NVSHMEM program is initialized either by ``nvshmem_init`` or
``nvshmem_init_thread``. Once an NVSHMEM library initialization call has been
performed, a subsequent initialization call in the same program results in
undefined behavior. If the call to ``nvshmem_init_thread`` is unsuccessful in
allocating and initializing resources for the NVSHMEM library, then the
behavior of any subsequent call to the NVSHMEM library is undefined.

**Returns**

``nvshmem_init_thread`` returns 0 upon success; otherwise, it returns a nonzero
value.

**Notes**

The NVSHMEM library can be initialized either by ``nvshmem_init`` or
``nvshmem_init_thread``. If the NVSHMEM library is initialized by
``nvshmem_init``, the library implementation can choose to support any one of
the defined thread levels.

.. _subsec:shmem_query_thread:

**NVSHMEM_QUERY_THREAD**
~~~~~~~~~~~~~~~~~~~~~~~~

.. c:function:: void nvshmem\_query\_thread(int *provided)

*provided [OUT]*
    The thread level support provided by the NVSHMEM implementation.

**Description**

The ``nvshmem_query_thread`` call returns the level of thread support currently
being provided. If the NVSHMEM library was initialized by
``nvshmem_init_thread``, the value returned is the same as the value returned
in ``provided`` by that call.
If the library was initialized by ``nvshmem_init``, the implementation can
choose to provide any one of the defined thread levels, and
``nvshmem_query_thread`` returns this thread level.

**Returns**

None.
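
**Example**

The following host-side sketch illustrates how the setup, query, and thread
support routines described above might be combined. It is a minimal example
under stated assumptions, not a reference implementation: it assumes that
``nvshmem.h`` is on the include path, that the program is started with
whichever launcher the NVSHMEM installation is configured for, and that no
device-side work is needed. The comparison on ``provided`` relies on the
increasing ordering of the thread level constants. The printed format is purely
illustrative.

.. code-block:: c

   #include <stdio.h>
   #include <stdlib.h>
   #include <nvshmem.h>

   int main(void)
   {
       int provided = -1;

       /* Request the highest thread level; the library reports what it grants.
        * If initialization fails, later NVSHMEM calls are undefined, so exit. */
       if (nvshmem_init_thread(NVSHMEM_THREAD_MULTIPLE, &provided) != 0) {
           fprintf(stderr, "nvshmem_init_thread failed\n");
           return EXIT_FAILURE;
       }

       int mype = nvshmem_my_pe();   /* in the range 0 .. npes - 1 */
       int npes = nvshmem_n_pes();

       /* The thread level constants increase from SINGLE to MULTIPLE,
        * so a smaller value means a weaker level was granted. */
       if (provided < NVSHMEM_THREAD_MULTIPLE && mype == 0) {
           printf("requested level %d, granted level %d\n",
                  NVSHMEM_THREAD_MULTIPLE, provided);
       }

       /* After nvshmem_init_thread, the queried level matches 'provided'. */
       int queried = -1;
       nvshmem_query_thread(&queried);

       int major = 0, minor = 0;
       nvshmem_info_get_version(&major, &minor);

       /* The buffer must hold at least NVSHMEM_MAX_NAME_LEN characters. */
       char name[NVSHMEM_MAX_NAME_LEN];
       nvshmem_info_get_name(name);

       printf("PE %d of %d: %s, specification %d.%d, thread level %d\n",
              mype, npes, name, major, minor, queried);

       /* Collective: every PE must call this, from the thread that
        * performed the initialization. */
       nvshmem_finalize();
       return EXIT_SUCCESS;
   }

Each PE prints its own line, so the order of the output across PEs is not
deterministic.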