Communication Abstraction Library usage¶
Communication Abstraction Library (CAL) is a helper module for the cuSOLVERMp library that lets it communicate efficiently between GPUs. The cuSOLVERMp grid creation API accepts a cal_comm_t communicator object, which must be created before any cuSOLVERMp call.
Currently, CAL supports only the use case where each participating process uses a single GPU and each participating GPU is used by a single process.
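The one-process-per-GPU rule can be checked before a communicator is created. The sketch below is a minimal illustration, not part of CAL: `pick_local_device` is a hypothetical helper, and it assumes the caller already knows its node-local rank (for example from `MPI_Comm_split_type` with `MPI_COMM_TYPE_SHARED`) and the number of visible GPUs (for example from `cudaGetDeviceCount`).

```c
/* Hypothetical helper (not part of CAL): choose the GPU for a process from
   its node-local rank. Returns the device index, or -1 when the inputs are
   invalid or there are more local processes than GPUs, because a GPU may
   not be shared between processes. */
int pick_local_device(int local_rank, int device_count)
{
    if (local_rank < 0 || device_count <= 0 || local_rank >= device_count)
    {
        return -1;
    }
    /* One process per GPU: local rank i uses device i. */
    return local_rank;
}
```

A negative result is a signal to stop before calling cal_comm_create(), since the launch configuration would place two processes on one GPU.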
Communication abstraction library¶
Communications description¶
To initialize a cal_comm_t communicator handle, you need to follow a bootstrapping process: see the cal_comm_create() function, or the example of how to create a communicator handle with MPI.
The main communication backend used by cuSOLVERMp is the modular OpenUCC library. The OpenUCC transport modules used at runtime (such as OpenUCX, NCCL, and others) and their configuration can be controlled in various ways, for example through the environment variables that affect NCCL (https://docs.nvidia.com/deeplearning/nccl/user-guide/docs/env.html) and the environment variables that affect OpenUCX (https://openucx.readthedocs.io/en/master/faq.html). Ask your platform provider whether there are known optimized settings for these dependencies. Otherwise, cuSOLVERMp and UCC will use default values.
Because of the nature of the underlying communications, there are a few restrictions to keep in mind when using cuSOLVERMp:
Only one cuSOLVERMp routine can be in flight at any point in time in the process. It is possible, however, to create and keep multiple communicator/cuSOLVERMp handles, but it is the user's responsibility to ensure that executions of cuSOLVERMp routines do not overlap.
If you are using the NCCL communication library, NCCL collective calls should not overlap with cuSOLVERMp calls, to avoid possible deadlocks.
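One way an application could enforce the first restriction is a process-wide guard that refuses to start a routine while another is still running. The sketch below uses a C11 `atomic_flag`; the names `solver_call_begin` and `solver_call_end` are hypothetical and not part of cuSOLVERMp or CAL.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Set while a cuSOLVERMp routine is considered in flight in this process. */
static atomic_flag solver_in_flight = ATOMIC_FLAG_INIT;

/* Hypothetical guard: returns true if the caller may start a routine now,
   false if another routine is still in flight. */
bool solver_call_begin(void)
{
    return !atomic_flag_test_and_set(&solver_in_flight);
}

/* Call only after the routine's work has completed, for example after the
   stream it ran on has been synchronized. */
void solver_call_end(void)
{
    atomic_flag_clear(&solver_in_flight);
}
```

The same guard could also be held around NCCL collectives issued by the application, so that they do not overlap with cuSOLVERMp calls.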
Configuring communications submodule¶
A few environment variables can change the behavior of the communication module.
Variable | Description |
---|---|
CAL_LOG_LEVEL | Verbosity level of communication module with |
UCC_CONFIG_FILE | Custom config file for UCC library. It can control underlying transports and collective parameters. Refer to UCC documentation for details on configuration syntax. Default: built-in cuSOLVERMp UCC configuration. |
By default, cuSOLVERMp uses the UCC configuration file provided in the release package, share/ucc.conf. The library loads it using a path relative to the cusolverMp shared object location.
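The variables above are normally exported in the launching shell or job script. If an application wants to supply its own UCC configuration file from code, a sketch like the following could be used. It assumes the variable is read when the communication module initializes, so the call would have to happen before any communicator is created; `use_ucc_config` is a hypothetical helper, not a CAL function.

```c
#define _POSIX_C_SOURCE 200112L
#include <stdlib.h>

/* Hypothetical helper (not part of CAL or cuSOLVERMp): point UCC_CONFIG_FILE
   at a custom file unless it is already set in the environment.
   Returns 0 on success, -1 on an empty path or if setenv fails. */
int use_ucc_config(const char* path)
{
    if (path == NULL || path[0] == '\0')
    {
        return -1;
    }
    /* Last argument 0: keep a value already exported by the user. */
    return setenv("UCC_CONFIG_FILE", path, 0);
}
```

Because the overwrite flag is 0, a value chosen by the user or the platform provider takes precedence over the application's default.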
Creating communicator handle with MPI¶
If your application uses the Message Passing Interface (MPI) for distributed communications, here is how it can be used to bootstrap a Communication Abstraction Library communicator:
#include "cal.h"
#include "cusolverMp.h"
#include "mpi.h"
// Bootstrap allgather callback: starts a non-blocking MPI allgather and
// hands the MPI request back through `request`.
calError_t allgather(void* src_buf, void* recv_buf, size_t size, void* data, void** request)
{
    MPI_Request req;
    int err = MPI_Iallgather(src_buf, size, MPI_BYTE, recv_buf, size, MPI_BYTE, (MPI_Comm)data, &req);
    if (err != MPI_SUCCESS)
    {
        return CAL_ERROR;
    }
    *request = (void*)req;
    return CAL_OK;
}

// Reports whether the allgather started above has completed.
calError_t request_test(void* request)
{
    MPI_Request req = (MPI_Request)request;
    int completed;
    int err = MPI_Test(&req, &completed, MPI_STATUS_IGNORE);
    if (err != MPI_SUCCESS)
    {
        return CAL_ERROR;
    }
    return completed ? CAL_OK : CAL_ERROR_INPROGRESS;
}

// Nothing to release here: MPI_Test frees a completed request.
calError_t request_free(void* request)
{
    return CAL_OK;
}

// Fills cal_comm_create_params_t with the MPI-based callbacks and creates
// the CAL communicator.
calError_t cal_comm_create_mpi(MPI_Comm mpi_comm, int rank, int nranks, int local_device, cal_comm_t* comm)
{
    cal_comm_create_params_t params;
    params.allgather = allgather;
    params.req_test = request_test;
    params.req_free = request_free;
    params.data = (void*)mpi_comm;
    params.rank = rank;
    params.nranks = nranks;
    params.local_device = local_device;
    return cal_comm_create(params, comm);
}

int main(int argc, char** argv)
{
    // Initialize MPI, create some distributed data
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm mpi_comm = MPI_COMM_WORLD;
    MPI_Comm_rank(mpi_comm, &rank);
    MPI_Comm_size(mpi_comm, &size);
    // Placeholder: replace with the GPU assigned to this process (one GPU
    // per process) and create its device context before using it
    int local_device = 0;
    cal_comm_t cal_comm;
    // Create the communicator handle from the MPI communicator
    cal_comm_create_mpi(mpi_comm, rank, size, local_device, &cal_comm);
    // Use cuSOLVERMp with the cal_comm handle
    cusolverMpCreateDeviceGrid(..., cal_comm, ...);
    // Destroy the communicator handle
    cal_comm_destroy(cal_comm);
    MPI_Finalize();
    return 0;
}
For convenience, these MPI helper functions are also provided in source form in the src folder of the release package.
Communication abstraction library data types¶
calError_t
¶
Return values from communication abstraction library APIs. The values are described in the table below:
Value | Description |
---|---|
CAL_OK | Success. |
CAL_ERROR | Generic error. |
CAL_ERROR_INVALID_PARAMETER | Invalid parameter to the interface function. |
CAL_ERROR_INTERNAL | Internal error. |
CAL_ERROR_CUDA | Error in CUDA runtime or driver API. |
CAL_ERROR_IPC | Error in system IPC communication call. |
CAL_ERROR_UCC | Error in UCC call. |
CAL_ERROR_NOT_SUPPORTED | Requested configuration or parameters are not supported. |
CAL_ERROR_BACKEND | Error in general backend dependency. Run with verbose log level to see a detailed error message. |
CAL_ERROR_INPROGRESS | Operation is still in progress. |
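Applications often want a readable name for a returned status when logging. The sketch below shows one way to do that. The enum is a local stand-in so that the example compiles without cal.h: only the value names come from the table above, their numeric values are not those of the real header, and the helpers `cal_error_name` and `cal_check` are hypothetical.

```c
#include <stdio.h>

/* Stand-in for the enum declared in cal.h (names from the table above;
   numeric values are NOT taken from the real header). In real code,
   include cal.h and remove this typedef. */
typedef enum
{
    CAL_OK,
    CAL_ERROR,
    CAL_ERROR_INVALID_PARAMETER,
    CAL_ERROR_INTERNAL,
    CAL_ERROR_CUDA,
    CAL_ERROR_IPC,
    CAL_ERROR_UCC,
    CAL_ERROR_NOT_SUPPORTED,
    CAL_ERROR_BACKEND,
    CAL_ERROR_INPROGRESS
} calError_t;

/* Hypothetical helper: maps a status value to its name. */
const char* cal_error_name(calError_t err)
{
    switch (err)
    {
        case CAL_OK: return "CAL_OK";
        case CAL_ERROR: return "CAL_ERROR";
        case CAL_ERROR_INVALID_PARAMETER: return "CAL_ERROR_INVALID_PARAMETER";
        case CAL_ERROR_INTERNAL: return "CAL_ERROR_INTERNAL";
        case CAL_ERROR_CUDA: return "CAL_ERROR_CUDA";
        case CAL_ERROR_IPC: return "CAL_ERROR_IPC";
        case CAL_ERROR_UCC: return "CAL_ERROR_UCC";
        case CAL_ERROR_NOT_SUPPORTED: return "CAL_ERROR_NOT_SUPPORTED";
        case CAL_ERROR_BACKEND: return "CAL_ERROR_BACKEND";
        case CAL_ERROR_INPROGRESS: return "CAL_ERROR_INPROGRESS";
    }
    return "unknown calError_t value";
}

/* Hypothetical helper: returns 0 on CAL_OK; otherwise logs the failing
   step to stderr and returns 1. */
int cal_check(calError_t err, const char* what)
{
    if (err == CAL_OK)
    {
        return 0;
    }
    fprintf(stderr, "%s failed: %s\n", what, cal_error_name(err));
    return 1;
}
```

Note that CAL_ERROR_INPROGRESS is not a failure in a polling context, so a caller that polls would test for it before passing the status to a helper like this.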
cal_comm_t
¶
cal_comm_t stores the device endpoint and resources related to communication. It must be created and destroyed using the cal_comm_create() and cal_comm_destroy() functions, respectively.
cal_comm_create_params_t
¶
typedef struct cal_comm_create_params
{
    calError_t (*allgather)(void* src_buf, void* recv_buf, size_t size, void* data, void** request);
    calError_t (*req_test)(void* request);
    calError_t (*req_free)(void* request);
    void* data;
    int nranks;
    int rank;
    int local_device;
} cal_comm_create_params_t;
The cal_comm_create_params_t structure is the parameter to the communication module creation function. It must be filled in by the user before calling cal_comm_create(). The fields of this structure are described below:
Field | Description |
---|---|
allgather | Pointer to function that implements |
req_test | If the allgather function is asynchronous, this function will be used to query whether or not data was exchanged and can be used by the communicator. Should return |
req_free | If the allgather function is asynchronous, this function will be used after the data exchange by |
data | Pointer to additional data that will be provided to |
nranks | Number of ranks participating in the communicator that will be created. |
rank | Rank that will be assigned to the caller process in the new communicator. Should be the number between |
local_device | Local device that will be used by cusolverMp with this communicator. Note that the user should create the device context before using this device in CAL or cusolverMp calls. |
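The relationship between the three callbacks can be illustrated with a small driver that starts an exchange, polls until it completes, and then releases the request. This is only a sketch of how a caller could use such callbacks, and how CAL drives them internally is an assumption here. The return convention for req_test (CAL_OK when done, CAL_ERROR_INPROGRESS otherwise) follows the MPI example above. The enum is a local stand-in so that the code compiles without cal.h, and `blocking_exchange` and the `fake_*` callbacks are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in so the sketch compiles without cal.h; only the names come from
   the documentation. In real code, include cal.h and remove this typedef. */
typedef enum { CAL_OK, CAL_ERROR, CAL_ERROR_INPROGRESS } calError_t;

typedef calError_t (*allgather_fn)(void*, void*, size_t, void*, void**);
typedef calError_t (*req_fn)(void*);

/* Hypothetical driver: start the exchange, poll until it is no longer in
   progress, then release the request. */
calError_t blocking_exchange(allgather_fn allgather, req_fn req_test, req_fn req_free,
                             void* src, void* dst, size_t size, void* data)
{
    void* request = NULL;
    calError_t err = allgather(src, dst, size, data, &request);
    if (err != CAL_OK)
    {
        return err;
    }
    while ((err = req_test(request)) == CAL_ERROR_INPROGRESS)
    {
        /* Busy-wait; a real caller could do other work between polls. */
    }
    calError_t free_err = req_free(request);
    return err != CAL_OK ? err : free_err;
}

/* Single-rank fake callbacks used only to exercise the driver. */
static int polls_left;

calError_t fake_allgather(void* src, void* dst, size_t size, void* data, void** request)
{
    (void)data;
    memcpy(dst, src, size); /* with one rank, gathering is a plain copy */
    polls_left = 3;         /* pretend completion takes three polls */
    *request = &polls_left;
    return CAL_OK;
}

calError_t fake_req_test(void* request)
{
    int* left = (int*)request;
    if (*left > 0)
    {
        --*left;
        return CAL_ERROR_INPROGRESS;
    }
    return CAL_OK;
}

calError_t fake_req_free(void* request)
{
    (void)request;
    return CAL_OK;
}
```

With a synchronous allgather, req_test can simply report completion on the first call, and the same driver still works.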
Communication abstraction library API¶
cal_comm_create
¶
calError_t cal_comm_create(
    cal_comm_create_params_t params,
    cal_comm_t* new_comm)
The device context should be created before this call, for example with cudaSetDevice(device_id); cudaFree(0). See the cal_comm_create_params_t documentation for instructions on how to fill this structure. You can see how a CAL communicator can be created in the MPI example or in the cuSOLVERMp samples.
Parameter | Description |
---|---|
params | Communicator creation parameters, see cal_comm_create_params_t. The local_device field is the local device id that will be assigned to the new communicator and should be the same as the device of the active context. |
new_comm | Pointer where to store new communicator handle. |
See calError_t for the description of the return value.
cal_comm_destroy
¶
calError_t cal_comm_destroy(
    cal_comm_t comm)
Parameter | Description |
---|---|
comm | Communicator handle to release. |
See calError_t for the description of the return value.
cal_stream_sync
¶
calError_t cal_stream_sync(
    cal_comm_t comm,
    cudaStream_t stream)
Synchronizes the given stream. Use this function in place of cudaStreamSynchronize in order to progress possible outstanding communication operations for the communicator.
Parameter | Description |
---|---|
comm | Communicator handle. |
stream | CUDA stream to synchronize. |
See calError_t for the description of the return value.
cal_get_comm_size
¶
calError_t cal_get_comm_size(
    cal_comm_t comm,
    int* size)
Parameter | Description |
---|---|
comm | Communicator handle. |
size | Number of processing elements. |
See calError_t for the description of the return value.
cal_get_rank
¶
calError_t cal_get_rank(
    cal_comm_t comm,
    int* rank)
Parameter | Description |
---|---|
comm | Communicator handle. |
rank | Rank id of the caller process. |
See calError_t for the description of the return value.