Overview

Software Components

ScaLAPACK aims to provide both efficiency and portability. It achieves this by isolating the low-level, machine-dependent components while keeping the overall implementation close to the sequential version of the code. In general, BLAS and BLACS are the low-level components that need to be optimized for each computing platform.

  • ScaLAPACK: parallel linear algebra operations.

  • PBLAS: parallel version of BLAS.

  • BLACS: MPI communications for linear algebra operations.

  • Dependencies: MPI, BLAS and LAPACK.

NVPL ScaLAPACK is provided as two dynamic libraries, each available in several variants:

  • libnvpl_scalapack_<int_type>.so contains ScaLAPACK API, PBLAS, and redistribution tools;

  • libnvpl_blacs_<int_type>_<mpi_type>.so supports multiple MPI implementations.
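The naming pattern above can be read mechanically. The following is a minimal sketch in POSIX shell that splits two BLACS library names into their `<int_type>` and `<mpi_type>` parts; the two file names are taken from the installation listing in this section, and the variable names are illustrative only.

```shell
# Split BLACS library file names into <int_type> and <mpi_type>.
for lib in libnvpl_blacs_lp64_openmpi4.so libnvpl_blacs_ilp64_mpich.so; do
  stem="${lib#libnvpl_blacs_}"   # drop the common prefix
  stem="${stem%.so}"             # drop the extension
  int_type="${stem%%_*}"         # text before the first underscore
  mpi_type="${stem#*_}"          # text after the first underscore
  echo "${lib}: int_type=${int_type} mpi_type=${mpi_type}"
done
```

The same two values select the matching `-lnvpl_blacs_<int_type>_<mpi_type>` option at link time.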

A typical installation directory structure is shown below (version numbers omitted):

$ tree
.
├──   LICENSE
├──   include
│   ├──   nvpl_scalapack.h
│   ├──   nvpl_scalapack_blacs.h
│   ├──   nvpl_scalapack_cblacs.h
│   ├──   nvpl_scalapack_pblas.h
│   ├──   nvpl_scalapack_types.h
│   └──   nvpl_scalapack_version.h
└──   lib
    ├──   cmake
    │   └──   nvpl_scalapack
    │       ├──   nvpl_scalapack-config-version.cmake
    │       ├──   nvpl_scalapack-config.cmake
    │       ├──   nvpl_scalapack-targets-release.cmake
    │       └──   nvpl_scalapack-targets.cmake
    ├──   libnvpl_blacs_ilp64_mpich.so
    ├──   libnvpl_blacs_ilp64_openmpi3.so
    ├──   libnvpl_blacs_ilp64_openmpi4.so
    ├──   libnvpl_blacs_ilp64_openmpi5.so
    ├──   libnvpl_blacs_lp64_mpich.so
    ├──   libnvpl_blacs_lp64_openmpi3.so
    ├──   libnvpl_blacs_lp64_openmpi4.so
    ├──   libnvpl_blacs_lp64_openmpi5.so
    ├──   libnvpl_scalapack_ilp64.so
    └──   libnvpl_scalapack_lp64.so

Building and Linking

For convenience, we assume that the environment variables nvpl_ROOT and MPI_ROOT are set to the NVPL and MPI installation directories. The following example illustrates how to build an application with the lp64 integer type, linking against the sequential BLAS and LAPACK libraries.

### using MPI compiler wrapper
$ mpicxx -I${nvpl_ROOT}/include app.cpp -c -o app.cpp.o
$ mpicxx -o app app.cpp.o \
    -L${nvpl_ROOT}/lib -lnvpl_scalapack_lp64 \
                       -lnvpl_blacs_lp64_openmpi4 \
                       -lnvpl_lapack_lp64_seq \
                       -lnvpl_blas_lp64_seq

### or, without MPI compiler wrapper
$ g++ -I${nvpl_ROOT}/include -DOMPI_SKIP_MPICXX app.cpp -c -o app.cpp.o
$ g++ -o app app.cpp.o \
    -L${nvpl_ROOT}/lib -lnvpl_scalapack_lp64 \
                       -lnvpl_blacs_lp64_openmpi4 \
                       -lnvpl_lapack_lp64_seq \
                       -lnvpl_blas_lp64_seq \
    -L${MPI_ROOT}/lib  -lmpi

The next example builds the application with the ilp64 integer type and links against the parallel (OpenMP) versions of BLAS and LAPACK.

### using MPI compiler wrapper
$ mpicxx -DNVPL_ILP64 -I${nvpl_ROOT}/include app.cpp -c -o app.cpp.o
$ mpicxx -o app app.cpp.o \
    -L${nvpl_ROOT}/lib -lnvpl_scalapack_ilp64 \
                       -lnvpl_blacs_ilp64_openmpi4 \
                       -lnvpl_lapack_ilp64_gomp \
                       -lnvpl_blas_ilp64_gomp

### or, without MPI compiler wrapper
$ g++ -DNVPL_ILP64 -I${nvpl_ROOT}/include -DOMPI_SKIP_MPICXX app.cpp -c -o app.cpp.o
$ g++ -o app app.cpp.o \
    -L${nvpl_ROOT}/lib -lnvpl_scalapack_ilp64 \
                       -lnvpl_blacs_ilp64_openmpi4 \
                       -lnvpl_lapack_ilp64_gomp \
                       -lnvpl_blas_ilp64_gomp \
    -L${MPI_ROOT}/lib  -lmpi \
                       -lgomp

To run the code, ensure that LD_LIBRARY_PATH includes ${nvpl_ROOT}/lib and ${MPI_ROOT}/lib.
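A minimal sketch of that environment setup follows. The fallback paths `/opt/nvpl` and `/usr/lib/openmpi` are placeholders assumed for illustration, used only when nvpl_ROOT and MPI_ROOT are not already set; substitute the actual installation directories.

```shell
# Placeholder fallbacks (assumptions); real installations will differ.
: "${nvpl_ROOT:=/opt/nvpl}"
: "${MPI_ROOT:=/usr/lib/openmpi}"

# Prepend both library directories, preserving any existing entries.
export LD_LIBRARY_PATH="${nvpl_ROOT}/lib:${MPI_ROOT}/lib${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}"
echo "LD_LIBRARY_PATH=${LD_LIBRARY_PATH}"
```

With the path set, the application can be started through the MPI launcher, for example `mpirun -np 4 ./app`, where the process count of 4 is arbitrary.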

LP64 and ILP64 Interfaces

NVPL ScaLAPACK supports two integer types:

  • 32-bit wide (aka LP64 interface), and

  • 64-bit wide (aka ILP64 interface).

To use the ILP64 variants of NVPL BLAS and NVPL LAPACK, pass the appropriate compiler flag, as listed below.

Language    Compiler Flags for ILP64       Description
C/C++       -DNVPL_ILP64                   extend nvpl_int_t to int64_t
Fortran     -fdefault-integer-8 for GNU    set the default integer/logical type to 8 bytes
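The compile flag and the library suffix have to agree on the integer type. The sketch below shows one way to keep them in sync for a C/C++ build; the helper `nvpl_int_flags` and the variables `int_type`, `CXXFLAGS_INT`, and `LIBS` are hypothetical names introduced here, and the OpenMP (`_gomp`) and `openmpi4` library variants are chosen only as an example.

```shell
# Hypothetical helper: print the C/C++ define required for an integer type.
nvpl_int_flags() {
  case "$1" in
    ilp64) echo "-DNVPL_ILP64" ;;   # widens nvpl_int_t to int64_t
    lp64)  echo "" ;;               # default interface, no define needed
    *)     echo "unknown integer type: $1" >&2; return 1 ;;
  esac
}

int_type=ilp64   # or lp64
CXXFLAGS_INT="$(nvpl_int_flags "${int_type}")"
LIBS="-lnvpl_scalapack_${int_type} -lnvpl_blacs_${int_type}_openmpi4 \
-lnvpl_lapack_${int_type}_gomp -lnvpl_blas_${int_type}_gomp"

echo "compile flags: ${CXXFLAGS_INT}"
echo "link flags:    ${LIBS}"
```

Mixing an lp64 library with code compiled using `-DNVPL_ILP64` (or the reverse) would pass integers of the wrong width, so deriving both settings from a single variable avoids that mismatch.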

Enabling OpenMP

OpenMP support is built on the GNU OpenMP runtime. To use the parallel BLAS and LAPACK, link against the libnvpl_lapack_<int_type>_gomp.so and libnvpl_blas_<int_type>_gomp.so libraries.

The following OpenMP runtimes are ABI-compatible with the GNU OpenMP runtime:

  • GNU OpenMP runtime libgomp.so.1,

  • LLVM OpenMP runtime libomp.so, libomp.so.5, etc.,

  • NVIDIA OpenMP runtime libnvomp.so.
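As a rough sketch of a threaded run, the snippet below sets the standard OpenMP variable `OMP_NUM_THREADS` and, if an executable named `./app` is present, lists which OpenMP runtime it resolves at load time. The thread count of 4 and the binary name are assumptions for illustration.

```shell
# Standard OpenMP environment variable; the value 4 is arbitrary.
export OMP_NUM_THREADS=4

# Report the OpenMP runtime the dynamic loader would pick for the binary.
if [ -x ./app ]; then
  ldd ./app | grep -E 'lib(gomp|omp|nvomp)\.so' \
    || echo "no OpenMP runtime found among the resolved libraries"
fi
```

In a hybrid run each MPI process typically starts its own team of threads, so the total thread count is roughly the number of processes multiplied by `OMP_NUM_THREADS`.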