Building and Linking#
Overview#
The NVPL BLAS library consists of two layers:
Interface layer: this layer exposes the user-facing configuration choices, namely the integer width and the threading runtime. There are multiple interface libraries, each representing one particular configuration. An application should link exactly one interface library.
Core layer: a single library that contains optimized implementations of the functionality.
All interface libraries depend on the core library. When using dynamic libraries, it is enough to link against the interface library only, because it explicitly depends on the core library.
Quick Examples#
To build an application using LP64 sequential NVPL BLAS:
gcc app.c -c -o app.o
gcc app.o -lnvpl_blas_lp64_seq
To build an application using ILP64 threaded NVPL BLAS:
gcc app.c -c -o app.o -DNVPL_ILP64
gcc app.o -lnvpl_blas_ilp64_gomp -lgomp
LP64 and ILP64 Interfaces#
NVPL BLAS supports two integer types in F77 BLAS and CBLAS APIs:
32-bit wide (aka LP64 interface), and
64-bit wide (aka ILP64 interface).
Switching between the interfaces requires using the proper compiler flags and linking against the matching NVPL BLAS interface library.
Compilation: C / C++#
The C APIs use the nvpl_int_t type for integer parameters. It expands to
int32_t (or equivalent) when the NVPL_ILP64 macro is not defined, and to
int64_t (or equivalent) when it is defined.
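The macro-driven type selection described above can be sketched as follows. This is only an illustration of the mechanism, not the contents of the NVPL headers: `sketch_int_t` and `sketch_int_bits` are hypothetical names introduced here, and the real definition of nvpl_int_t may differ in detail.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for nvpl_int_t. The real typedef comes from the
   NVPL headers; this sketch only mirrors the documented behavior. */
#ifdef NVPL_ILP64
typedef int64_t sketch_int_t;   /* ILP64: 64-bit integer parameters */
#else
typedef int32_t sketch_int_t;   /* LP64: 32-bit integer parameters */
#endif

/* Width, in bits, of the integer type selected at compile time. */
static size_t sketch_int_bits(void) {
    return sizeof(sketch_int_t) * 8;
}
```

Compiling this translation unit with -DNVPL_ILP64 selects the 64-bit branch; compiling it without the macro selects the 32-bit branch.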
Compilation: Fortran#
To use the ILP64 NVPL BLAS interface, pass the -fdefault-integer-8 compiler
flag (as accepted by gfortran), or your compiler's equivalent.
Linking#
Because the LP64 and ILP64 interfaces export the same function names, NVPL BLAS ships them as separate libraries. Make sure you link against the library that matches the integer width you compiled with; otherwise the application may produce incorrect results or crash at run time.
To use the LP64 interface, link the libnvpl_blas_lp64_* interface library.
To use the ILP64 interface, link the libnvpl_blas_ilp64_* interface library.
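To illustrate why an integer-width mismatch can silently produce wrong results, here is a minimal sketch. It does not call NVPL BLAS: `callee_reads_lp64` and `call_with_ilp64_argument` are hypothetical functions that imitate an F77-style routine taking its integer argument by reference, with the caller and callee disagreeing on the width.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical callee built for the LP64 interface: it interprets a
   by-reference integer argument as 32 bits wide. */
static int32_t callee_reads_lp64(const void *n) {
    int32_t value;
    memcpy(&value, n, sizeof value);  /* reads only the first 4 bytes */
    return value;
}

/* Hypothetical caller compiled for ILP64: it passes a 64-bit integer. */
static int32_t call_with_ilp64_argument(int64_t n) {
    return callee_reads_lp64(&n);
}
```

On a little-endian machine the callee sees only the low-order 32 bits of the argument, so a dimension of 2^32 + 5 would be read as 5 with no diagnostic. On a big-endian machine it would see the high-order half instead.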
Threading Interfaces#
NVPL BLAS provides sequential and threaded (OpenMP-based) interfaces. Link your application or library against the one you prefer.
Sequential#
The sequential interface doesn’t depend on any threading runtime, and all the
functions only use the calling thread to perform the computations.
To use it, link the libnvpl_blas_*_seq interface library.
OpenMP-based Threading#
The OpenMP-based threading interface is built on the GNU OpenMP runtime.
To use it, link the libnvpl_blas_*_gomp interface library.
The following runtimes are ABI compatible with the GNU OpenMP runtime:
GNU OpenMP runtime: libgomp.so.1.
LLVM OpenMP runtime: libomp.so, libomp.so.5, etc.
NVIDIA’s OpenMP runtime: libnvomp.so.
ABI compatibility here means that an application or a library built with GNU OpenMP, including the NVPL BLAS OpenMP-based threading interface, works transparently with these other runtimes.
Since different OpenMP runtimes use different library names, the NVPL BLAS OpenMP-based threading interface doesn’t explicitly depend on any of them. Instead, the library implements lazy dynamic symbol resolution.
Lazy means that the symbol resolution happens at run time on the first use.
Dynamic means that the symbols are resolved using the dlopen() and dlsym() APIs (or analogues).
NVPL BLAS first looks for the OpenMP symbols in the current process’s address space and, if that fails, loads (dlopen()) the default runtime, which is GNU OpenMP.
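The lookup order described above can be sketched with the POSIX dlopen() and dlsym() calls. This is an illustration under stated assumptions, not NVPL BLAS's actual implementation: `resolve_symbol` and `sketch_max_threads` are hypothetical helpers, and the sketch uses omp_get_max_threads only as a convenient example of an OpenMP symbol.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Hypothetical helper: search the running process first; if the symbol
   is absent, load fallback_lib and search it instead. */
static void *resolve_symbol(const char *symbol, const char *fallback_lib) {
    /* dlopen(NULL, ...) returns a handle for the current process image. */
    void *self = dlopen(NULL, RTLD_LAZY);
    void *addr = (self != NULL) ? dlsym(self, symbol) : NULL;
    if (addr != NULL)
        return addr;

    /* Not loaded yet: fall back to the default runtime library. */
    void *handle = dlopen(fallback_lib, RTLD_NOW | RTLD_GLOBAL);
    if (handle == NULL)
        return NULL;
    return dlsym(handle, symbol);
}

typedef int (*max_threads_fn)(void);

/* Lazy: the symbol is resolved on first use and cached afterwards.
   This sketch is not thread-safe. Returns 1 if no runtime is found. */
static int sketch_max_threads(void) {
    static max_threads_fn fn = NULL;
    if (fn == NULL)
        fn = (max_threads_fn)resolve_symbol("omp_get_max_threads",
                                            "libgomp.so.1");
    return (fn != NULL) ? fn() : 1;
}
```

If the application already links an OpenMP runtime, the first lookup succeeds and the fallback is never loaded. On older glibc versions, linking such code may additionally require -ldl.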
Warning
Because the NVPL BLAS OpenMP-based threading interface does not explicitly depend on any particular OpenMP runtime (and falls back to the GNU OpenMP runtime by default), it is strongly recommended to always link the appropriate OpenMP runtime into the final application or library that uses NVPL BLAS. This ensures that the intended OpenMP runtime is the one loaded during execution.
Using CMake#
NVPL BLAS can be linked in CMake using either the NVPL CMake package config or the FindBLAS module.
Using NVPL CMake Package Config#
NVPL provides native CMake package configuration files that allow direct integration into CMake projects. This is the recommended approach as it provides access to specific NVPL BLAS targets for different interface and threading configurations.
To use NVPL BLAS in your CMake project, use find_package(nvpl) and link
against one of the available targets:
cmake_minimum_required(VERSION 3.18)
project(MyApp C)
find_package(nvpl REQUIRED)
add_executable(myapp app.c)
# Link NVPL BLAS LP64 interface with OpenMP threading
target_link_libraries(myapp PRIVATE nvpl::blas_lp64_gomp)
Available targets include nvpl::blas_lp64_seq, nvpl::blas_lp64_gomp,
nvpl::blas_ilp64_seq, and nvpl::blas_ilp64_gomp.
Using CMake FindBLAS Module#
Starting with CMake 4.1, the FindBLAS module has built-in support for
detecting and linking NVPL BLAS. This provides a portable way to integrate
NVPL BLAS into CMake-based projects.
Set the following configuration variables as needed:
BLA_VENDOR - Set to NVPL to use NVIDIA Performance Libraries
BLA_SIZEOF_INTEGER - Set to 4 for LP64 or 8 for ILP64
BLA_THREAD - Set to SEQ for sequential or OMP for OpenMP threading
Then call find_package(BLAS REQUIRED) to make the BLAS::BLAS target available.
A complete example:
cmake_minimum_required(VERSION 4.1)
project(MyApp C)
# Select NVPL BLAS ILP64 interface with OpenMP
set(BLA_VENDOR NVPL)
set(BLA_SIZEOF_INTEGER 8)
set(BLA_THREAD OMP)
find_package(BLAS REQUIRED)
# Find OpenMP for the application
find_package(OpenMP REQUIRED)
add_executable(myapp app.c)
# Define NVPL_ILP64 macro for ILP64 interface
target_compile_definitions(myapp PRIVATE NVPL_ILP64)
# Link BLAS and OpenMP
target_link_libraries(myapp PRIVATE BLAS::BLAS OpenMP::OpenMP_C)