NVIDIA Communication Abstraction Library (CAL) APIs

This chapter describes the Fortran interfaces to the CAL library. The library contains a small number of functions that are used together with other multi-processor libraries to initialize the communications layer that handles distributed matrices and arrays.

The CAL interfaces and definitions described in this chapter can be exposed in host code by adding the line

use nvf_cal_comm

to your program unit. Because the module definitions are also used within other modules in this document, we suggest that you access them through the higher-level modules instead, such as

use cublasMp

or

use cusolverMp

CAL API

This section describes the parameters and derived types defined in the CAL module.

! Definitions from cal.h
integer, parameter :: CAL_VER_MAJOR = 0
integer, parameter :: CAL_VER_MINOR = 4
integer, parameter :: CAL_VER_PATCH = 2
integer, parameter :: CAL_VER_BUILD = 0
integer, parameter :: CAL_VERSION = &
        (CAL_VER_MAJOR * 1000 + CAL_VER_MINOR * 100 + CAL_VER_PATCH)
enum, bind(c)
    enumerator :: CAL_OK = 0                       ! Success
    enumerator :: CAL_ERROR_INPROGRESS = 1         ! Request is in progress
    enumerator :: CAL_ERROR = 2                    ! Generic error
    enumerator :: CAL_ERROR_INVALID_PARAMETER = 3  ! Invalid parameter to the interface function.
    enumerator :: CAL_ERROR_INTERNAL = 4           ! Internal error
    enumerator :: CAL_ERROR_CUDA = 5               ! Error in CUDA runtime/driver API
    enumerator :: CAL_ERROR_UCC = 6                ! Error in UCC call
    enumerator :: CAL_ERROR_NOT_SUPPORTED = 7      ! Requested configuration not supported
end enum
! Types from cal.h
TYPE cal_comm
  TYPE(C_PTR) :: handle
END TYPE cal_comm

cal_comm_create_mpi

CAL is a communications library used by cublasMp and other multi-processor libraries. It uses MPI for initialization and possibly for other purposes. cal_comm_create_mpi is a convenience function provided with the Fortran interfaces to initialize the CAL communicator. Because of the dependence on MPI, source code for some CAL Fortran wrappers is shipped in the NVHPC package and can be built with the MPI headers used in your application. The version we ship works with the default MPI bundled in the NVHPC package. The cal_comm derived type output by this function is an input to cublasMpGridCreate(), cusolverMpCreateDeviceGrid(), and similar functions.

integer(4) function cal_comm_create_mpi(mpi_comm, &
      rank, nranks, local_device, comm)
  integer(4), intent(in) :: mpi_comm, rank, nranks, local_device
  type(cal_comm), intent(out) :: comm
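A typical call sequence might look like the sketch below. It assumes one GPU per MPI rank, selected from the rank's position on its node. That device-selection policy, the program name, and the variable names are illustrative choices rather than requirements stated in this chapter. The MPI and CUDA Fortran calls are the standard ones from the mpi and cudafor modules.

```fortran
program cal_create_example
  use mpi
  use cudafor
  use nvf_cal_comm
  implicit none
  integer(4) :: ierr, stat, rank, nranks
  integer(4) :: local_comm, local_rank, ndev, local_device
  type(cal_comm) :: comm

  call MPI_Init(ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, rank, ierr)
  call MPI_Comm_size(MPI_COMM_WORLD, nranks, ierr)

  ! Assumption: pick a device from the rank's index among the ranks
  ! that share this node.
  call MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0, &
                           MPI_INFO_NULL, local_comm, ierr)
  call MPI_Comm_rank(local_comm, local_rank, ierr)
  call MPI_Comm_free(local_comm, ierr)

  ierr = cudaGetDeviceCount(ndev)
  if (ierr /= cudaSuccess .or. ndev < 1) then
    print *, 'no usable CUDA device on rank', rank
    call MPI_Abort(MPI_COMM_WORLD, 1, ierr)
  end if
  local_device = mod(local_rank, ndev)
  ierr = cudaSetDevice(local_device)

  ! Create the CAL communicator on top of MPI_COMM_WORLD.
  stat = cal_comm_create_mpi(MPI_COMM_WORLD, rank, nranks, local_device, comm)
  if (stat /= CAL_OK) then
    print *, 'cal_comm_create_mpi failed on rank', rank, 'status', stat
    call MPI_Abort(MPI_COMM_WORLD, 1, ierr)
  end if

  ! ... pass comm to cublasMpGridCreate(), cusolverMpCreateDeviceGrid(), etc.

  ! Release the communicator before shutting down MPI.
  stat = cal_comm_destroy(comm)
  call MPI_Finalize(ierr)
end program cal_create_example
```

The sketch destroys the communicator before MPI_Finalize because CAL depends on MPI. That ordering is a cautious choice, not a documented requirement.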

cal_comm_destroy

This function destroys the cal_comm data structure and frees the resources associated with it.

integer(4) function cal_comm_destroy(comm)
  type(cal_comm) :: comm

cal_stream_sync

This function blocks the calling thread until all outstanding device operations in the specified stream, including CAL operations, have finished.

integer(4) function cal_stream_sync(comm, stream)
  type(cal_comm) :: comm
  integer(cuda_stream_kind) :: stream

cal_comm_barrier

This function synchronizes the streams of all processes in the CAL communicator. In effect, it is an all-to-all synchronization.

integer(4) function cal_comm_barrier(comm, stream)
  type(cal_comm) :: comm
  integer(cuda_stream_kind) :: stream
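The two synchronization functions are often used together: first drain the local stream, then wait for the other processes. The sketch below wraps that pattern in a subroutine. The module name cal_sync_util and the subroutine sync_all_ranks are hypothetical, and the sync-then-barrier order is a common usage pattern rather than something this chapter mandates.

```fortran
! Hypothetical helper module; not shipped with NVHPC.
module cal_sync_util
  use cudafor
  use nvf_cal_comm
  implicit none
contains
  ! Finish this process's work on the stream, then synchronize with
  ! every other process in the communicator.
  ! stat holds the first status that is not CAL_OK, or CAL_OK on success.
  subroutine sync_all_ranks(comm, stream, stat)
    type(cal_comm) :: comm
    integer(kind=cuda_stream_kind) :: stream
    integer(4), intent(out) :: stat

    ! Block until local device work on the stream has completed.
    stat = cal_stream_sync(comm, stream)
    if (stat /= CAL_OK) return

    ! Synchronize this stream with the streams of all other processes.
    stat = cal_comm_barrier(comm, stream)
  end subroutine sync_all_ranks
end module cal_sync_util
```

A caller would create comm with cal_comm_create_mpi and stream with cudaStreamCreate, then invoke call sync_all_ranks(comm, stream, stat) before reading back distributed results.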

cal_comm_get_rank

This function returns the rank of the calling thread in the CAL communicator through the rank argument.

integer(4) function cal_comm_get_rank(comm, rank)
  type(cal_comm) :: comm
  integer(4) :: rank

cal_comm_get_size

This function returns the size of the CAL communicator, that is, the number of ranks it contains, through the size argument.

integer(4) function cal_comm_get_size(comm, size)
  type(cal_comm) :: comm
  integer(4) :: size
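The two query functions can be combined to report where a process sits in the communicator. This is a minimal sketch, assuming the integer(4) function results are CAL status codes. The module name cal_query_util and the subroutine report_cal_layout are hypothetical.

```fortran
! Hypothetical helper module; not shipped with NVHPC.
module cal_query_util
  use nvf_cal_comm
  implicit none
contains
  ! Print this process's rank and the communicator size.
  subroutine report_cal_layout(comm)
    type(cal_comm) :: comm
    integer(4) :: stat, rank, nranks

    stat = cal_comm_get_rank(comm, rank)
    if (stat /= CAL_OK) then
      print *, 'cal_comm_get_rank failed, status', stat
      return
    end if

    ! The local variable is named nranks to avoid shadowing the
    ! SIZE intrinsic.
    stat = cal_comm_get_size(comm, nranks)
    if (stat /= CAL_OK) then
      print *, 'cal_comm_get_size failed, status', stat
      return
    end if

    print '(a,i0,a,i0)', 'CAL rank ', rank, ' of ', nranks
  end subroutine report_cal_layout
end module cal_query_util
```

Comparing these values with the rank and nranks passed to cal_comm_create_mpi is a quick sanity check after initialization.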