Overview
Software Components
ScaLAPACK aims to provide both efficiency and portability. It achieves this by isolating the low-level, machine-dependent components while keeping the overall implementation similar to the sequential version of the code. In general, BLAS and BLACS are the low-level components that need to be optimized for each computing platform.
ScaLAPACK: parallel linear algebra operations.
PBLAS: parallel version of BLAS.
BLACS: MPI communications for linear algebra operations.
Dependencies: MPI, BLAS and LAPACK.
NVPL ScaLAPACK provides two dynamic libraries, each in several variants:
libnvpl_scalapack_<int_type>.so contains the ScaLAPACK API, PBLAS, and redistribution tools;
libnvpl_blacs_<int_type>_<mpi_type>.so supports multiple MPI implementations.
A typical installation directory structure is shown below (version suffixes omitted):
$ tree
.
├── LICENSE
├── include
│   ├── nvpl_scalapack.h
│   ├── nvpl_scalapack_blacs.h
│   ├── nvpl_scalapack_cblacs.h
│   ├── nvpl_scalapack_pblas.h
│   ├── nvpl_scalapack_types.h
│   └── nvpl_scalapack_version.h
└── lib
    ├── cmake
    │   └── nvpl_scalapack
    │       ├── nvpl_scalapack-config-version.cmake
    │       ├── nvpl_scalapack-config.cmake
    │       ├── nvpl_scalapack-targets-release.cmake
    │       └── nvpl_scalapack-targets.cmake
    ├── libnvpl_blacs_ilp64_mpich.so
    ├── libnvpl_blacs_ilp64_openmpi3.so
    ├── libnvpl_blacs_ilp64_openmpi4.so
    ├── libnvpl_blacs_ilp64_openmpi5.so
    ├── libnvpl_blacs_lp64_mpich.so
    ├── libnvpl_blacs_lp64_openmpi3.so
    ├── libnvpl_blacs_lp64_openmpi4.so
    ├── libnvpl_blacs_lp64_openmpi5.so
    ├── libnvpl_scalapack_ilp64.so
    └── libnvpl_scalapack_lp64.so
Building and Linking
For convenience, we assume that the environment variables nvpl_ROOT and MPI_ROOT are set to the NVPL and MPI installation directories. The following example shows how to build an application with the lp64 integer type that links against sequential BLAS and LAPACK.
### using MPI compiler wrapper
$ mpicxx -I${nvpl_ROOT}/include app.cpp -c -o app.cpp.o
$ mpicxx -o app app.cpp.o \
-L${nvpl_ROOT}/lib -lnvpl_scalapack_lp64 \
-lnvpl_blacs_lp64_openmpi4 \
-lnvpl_lapack_lp64_seq \
-lnvpl_blas_lp64_seq
### or, without MPI compiler wrapper
$ g++ -I${nvpl_ROOT}/include -DOMPI_SKIP_MPICXX app.cpp -c -o app.cpp.o
$ g++ -o app app.cpp.o \
-L${nvpl_ROOT}/lib -lnvpl_scalapack_lp64 \
-lnvpl_blacs_lp64_openmpi4 \
-lnvpl_lapack_lp64_seq \
-lnvpl_blas_lp64_seq \
-L${MPI_ROOT}/lib -lmpi
The next example builds the application with the ilp64 integer type and links against the parallel versions of BLAS and LAPACK.
### using MPI compiler wrapper
$ mpicxx -DNVPL_ILP64 -I${nvpl_ROOT}/include app.cpp -c -o app.cpp.o
$ mpicxx -o app app.cpp.o \
-L${nvpl_ROOT}/lib -lnvpl_scalapack_ilp64 \
-lnvpl_blacs_ilp64_openmpi4 \
-lnvpl_lapack_ilp64_gomp \
-lnvpl_blas_ilp64_gomp
### or, without MPI compiler wrapper
$ g++ -DNVPL_ILP64 -I${nvpl_ROOT}/include -DOMPI_SKIP_MPICXX app.cpp -c -o app.cpp.o
$ g++ -o app app.cpp.o \
-L${nvpl_ROOT}/lib -lnvpl_scalapack_ilp64 \
-lnvpl_blacs_ilp64_openmpi4 \
-lnvpl_lapack_ilp64_gomp \
-lnvpl_blas_ilp64_gomp \
-L${MPI_ROOT}/lib -lmpi \
-lgomp
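The two link lines above differ only in the integer type, the MPI flavor, and the threading variant, so the library flags can be composed mechanically. The following is a minimal sketch of that composition. The helper name nvpl_link_flags is hypothetical and not part of NVPL; the library names follow the pattern used in the commands above.

```shell
# Hypothetical helper (not part of NVPL): print the -l flags for one variant.
#   $1 = integer type: lp64 | ilp64
#   $2 = MPI flavor:   mpich | openmpi3 | openmpi4 | openmpi5
#   $3 = threading:    seq | gomp
nvpl_link_flags() {
    case "$1" in
        lp64|ilp64) ;;
        *) echo "unknown integer type: $1" >&2; return 1 ;;
    esac
    case "$2" in
        mpich|openmpi3|openmpi4|openmpi5) ;;
        *) echo "unknown MPI flavor: $2" >&2; return 1 ;;
    esac
    case "$3" in
        seq|gomp) ;;
        *) echo "unknown threading variant: $3" >&2; return 1 ;;
    esac
    # Same order as in the commands above: ScaLAPACK, BLACS, LAPACK, BLAS.
    echo "-lnvpl_scalapack_$1 -lnvpl_blacs_$1_$2 -lnvpl_lapack_$1_$3 -lnvpl_blas_$1_$3"
}
```

A possible use is `mpicxx -o app app.cpp.o -L${nvpl_ROOT}/lib $(nvpl_link_flags ilp64 openmpi4 gomp)`. The helper only covers the NVPL libraries; any additional flags such as -lmpi or -lgomp still have to be added as in the commands above.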
To run the code, ensure that LD_LIBRARY_PATH includes ${nvpl_ROOT}/lib and ${MPI_ROOT}/lib.
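One way to set up the runtime search path is sketched below. The fallback prefixes /opt/nvpl and /opt/mpi are placeholders assumed for illustration only, and the process count in the final comment is arbitrary.

```shell
# Assumption: placeholder prefixes are used only when the variables are unset.
nvpl_ROOT=${nvpl_ROOT:-/opt/nvpl}
MPI_ROOT=${MPI_ROOT:-/opt/mpi}

# Prepend both library directories, keeping any entries that are already set.
LD_LIBRARY_PATH="${nvpl_ROOT}/lib:${MPI_ROOT}/lib${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"
export LD_LIBRARY_PATH

echo "$LD_LIBRARY_PATH"
# Then launch the application as usual, for example: mpirun -np 4 ./app
```

The `${LD_LIBRARY_PATH:+...}` expansion avoids leaving a trailing colon when the variable was previously empty, since an empty entry would make the loader search the current directory.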
LP64 and ILP64 Interfaces
NVPL ScaLAPACK supports two integer types:
32-bit wide (aka LP64 interface), and
64-bit wide (aka ILP64 interface).
To interface with a specific integer variant of NVPL BLAS and NVPL LAPACK, use the appropriate compiler flag, as shown below.
| Language | Compiler Flags for ILP64 | Description |
|---|---|---|
| C/C++ | | extend |
| Fortran | | set the default integer/logical type to 8 bytes |
Enabling OpenMP
OpenMP support is based on the GNU OpenMP runtime. To use the parallel BLAS and LAPACK, link against the libnvpl_lapack_<int_type>_gomp.so and libnvpl_blas_<int_type>_gomp.so libraries.
The following OpenMP runtimes are ABI compatible with GNU OpenMP:
GNU OpenMP runtime: libgomp.so.1,
LLVM OpenMP runtime: libomp.so, libomp.so.5, etc.,
NVIDIA OpenMP runtime: libnvomp.so.
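To see which of these runtimes an application actually resolves at load time, the output of ldd can be inspected. The sketch below is one way to do that; the helper name omp_runtime_from_ldd is hypothetical, and it simply reports the first matching runtime it finds in the listing.

```shell
# Hypothetical helper: read `ldd` output on stdin and name the first
# OpenMP runtime that appears in it, or "none" if no known runtime is listed.
omp_runtime_from_ldd() {
    # Each ldd line looks like: "<tab>libgomp.so.1 => /path/libgomp.so.1 (0x...)"
    while read -r lib rest; do
        case "$lib" in
            libgomp.so*)  echo "GNU";    return 0 ;;
            libomp.so*)   echo "LLVM";   return 0 ;;
            libnvomp.so*) echo "NVIDIA"; return 0 ;;
        esac
    done
    echo "none"
}
```

A typical invocation would be `ldd ./app | omp_runtime_from_ldd`, where ./app is the application built earlier.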