Building and Linking

Overview

The NVPL BLAS library consists of two layers:

  • Interface layer: this layer exposes the user-selectable configurations, such as the integer width and the threading runtime. There are multiple interface libraries, each representing one particular configuration. Link exactly one interface library into an application.

  • Core layer: a single library that contains optimized implementations of the functionality.

All interface libraries depend on the core library. When using dynamic libraries, it is enough to link against the interface library alone, because it explicitly depends on the core library.

Quick Examples

To build an application using LP64 sequential NVPL BLAS:

gcc app.c -c -o app.o
gcc app.o -lnvpl_blas_lp64_seq

To build an application using ILP64 threaded NVPL BLAS:

gcc app.c -c -o app.o -DNVPL_ILP64
gcc app.o -lnvpl_blas_ilp64_gomp -lgomp

LP64 and ILP64 Interfaces

NVPL BLAS supports two integer widths in the F77 BLAS and CBLAS APIs:

  • 32-bit wide (aka LP64 interface), and

  • 64-bit wide (aka ILP64 interface).

Switching between the interfaces requires both compiling with the proper flags and linking against the matching NVPL BLAS interface library.

Compilation: C / C++

The C APIs use the nvpl_int_t type for integer parameters. It expands to int32_t (or equivalent) when the NVPL_ILP64 macro is not defined, and to int64_t (or equivalent) when it is defined.
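The macro-driven type selection can be pictured with a minimal, self-contained sketch. The names demo_int_t and demo_int_width_bits are hypothetical stand-ins introduced only for illustration; the real nvpl_int_t is defined by the NVPL BLAS headers, and its exact definition is not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for nvpl_int_t: the width is fixed at compile time
 * by the presence or absence of the NVPL_ILP64 macro. */
#ifdef NVPL_ILP64
typedef int64_t demo_int_t;   /* ILP64: 64-bit integer parameters */
#else
typedef int32_t demo_int_t;   /* LP64: 32-bit integer parameters */
#endif

/* Compile-time sanity check: only the two documented widths are possible. */
static_assert(sizeof(demo_int_t) == 4 || sizeof(demo_int_t) == 8,
              "integer parameters are either 32 or 64 bits wide");

/* Width, in bits, of the integer type selected for this translation unit. */
static int demo_int_width_bits(void) {
    return (int)(sizeof(demo_int_t) * 8);
}
```

Compiling this sketch with -DNVPL_ILP64 switches demo_int_t to the 64-bit type. Every translation unit that passes integers to the library should be compiled with the same setting.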

Compilation: Fortran

To use the ILP64 NVPL BLAS interface, compile with the -fdefault-integer-8 flag or your compiler's equivalent.

Linking

Since the LP64 and ILP64 interfaces export the same function names, NVPL BLAS provides separate libraries for the two cases. Make sure you link against the library that matches your compilation flags; otherwise, you might get incorrect results or application crashes at run time.

  • To use the LP64 interface, link a libnvpl_blas_lp64_* interface library.

  • To use the ILP64 interface, link a libnvpl_blas_ilp64_* interface library.
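To see why a width mismatch can produce wrong results, consider the following sketch. It is a hypothetical illustration, not NVPL BLAS code: it assumes a callee that reads 64-bit integers from storage the caller filled with 32-bit integers, which is what can happen when integer arguments are passed by reference (as in F77 BLAS) and the application and library disagree on the width. The helper names are invented for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

static_assert(sizeof(int64_t) == 2 * sizeof(int32_t),
              "a 64-bit read spans two 32-bit slots");

/* Hypothetical ILP64-style callee: reads a 64-bit integer from the address
 * it was given, regardless of how the caller wrote the data. */
static int64_t read_as_int64(const int32_t *p) {
    int64_t wide;
    memcpy(&wide, p, sizeof wide);  /* 8 bytes: the value plus its neighbour */
    return wide;
}

/* Hypothetical LP64-style caller: stores a 32-bit n = 3 next to other data. */
static int64_t demo_mismatched_read(void) {
    int32_t caller_side[2] = {3, 1};  /* n = 3, followed by unrelated data */
    return read_as_int64(caller_side);
}
```

The value the callee observes is no longer 3, because the neighbouring 32 bits are folded into it. In a real call the neighbouring bytes are arbitrary, so the resulting dimension or stride is unpredictable.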

Threading Interfaces

NVPL BLAS provides sequential and threaded (OpenMP-based) interfaces. Link your application or library against the one you prefer.

Sequential

The sequential interface doesn’t depend on any threading runtime, and every function performs its computations on the calling thread only. To use it, link a libnvpl_blas_*_seq interface library.

OpenMP-based Threading

The OpenMP-based threading interface is built against the GNU OpenMP runtime. To use it, link a libnvpl_blas_*_gomp interface library.

The following runtimes are ABI compatible with GNU OpenMP runtime:

  • GNU OpenMP runtime: libgomp.so.1.

  • LLVM OpenMP runtime: libomp.so, libomp.so.5, etc.

  • NVIDIA’s OpenMP runtime: libnvomp.so.

ABI compatibility here means that an application or library built with GNU OpenMP, including the NVPL BLAS OpenMP-based threading interface, works transparently with these other runtimes.

Since different OpenMP runtimes use different library names, the NVPL BLAS OpenMP-based threading interface doesn’t explicitly depend on any of them. Instead, the library implements lazy dynamic symbol resolution.

  • Lazy means that the symbol resolution happens at run time on the first use.

  • Dynamic means that the symbols are resolved using dlopen() and dlsym() APIs (or analogues).

    • NVPL BLAS first looks for the OpenMP symbols in the current process’s address space and, if that fails, loads (dlopen()) the default runtime, which is GNU OpenMP.
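The lookup order described above can be sketched with the POSIX dlopen() and dlsym() calls. This is a minimal illustration of the general technique, not the actual NVPL BLAS implementation. The helper names resolve_symbol and demo_max_threads are hypothetical, and the choice of omp_get_max_threads as the symbol to resolve is an assumption made for the example.

```c
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper. Step 1: search the objects already loaded into the
 * process. Step 2: if the symbol is absent, load `fallback_lib` and search it. */
static void *resolve_symbol(const char *symbol, const char *fallback_lib) {
    /* dlopen(NULL, ...) returns a handle for the main program; dlsym() on it
     * also searches the shared objects loaded at startup. */
    void *self = dlopen(NULL, RTLD_LAZY);
    if (self != NULL) {
        void *addr = dlsym(self, symbol);
        if (addr != NULL) return addr;
    }

    void *handle = dlopen(fallback_lib, RTLD_LAZY | RTLD_GLOBAL);
    if (handle == NULL) return NULL;   /* fallback runtime not installed */
    return dlsym(handle, symbol);
}

typedef int (*max_threads_fn)(void);

/* Lazy: nothing is resolved until the first call; the result is then cached.
 * The cache is not synchronized, so this sketch is not thread-safe. */
static int demo_max_threads(void) {
    static max_threads_fn fn = NULL;
    if (fn == NULL) {
        fn = (max_threads_fn)resolve_symbol("omp_get_max_threads",
                                            "libgomp.so.1");
    }
    return fn != NULL ? fn() : 1;      /* behave sequentially if unresolved */
}
```

If the application already links an OpenMP runtime, the first lookup succeeds and the fallback library is never loaded. On older glibc versions, the sketch may need to be linked with -ldl.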

Warning

Since the NVPL BLAS OpenMP-based threading interface does not explicitly depend on any particular OpenMP runtime (the default being the GNU OpenMP runtime), it is strongly recommended to always link the appropriate OpenMP runtime into the final application or library that uses NVPL BLAS. This ensures that the appropriate OpenMP runtime is loaded during execution.