Build System

CMake Scripts

The PVA SDK ships with CMake scripts for building VPU ELFs and the corresponding host code. The build scripts can also generate monolithic binaries that run in the target host environment and contain the required device code as embedded data.

Installing the PVA SDK makes the pva-sdk CMake package available system-wide. To use this in a CMake project, the CMakeLists.txt file should look as follows:

cmake_minimum_required(VERSION 3.22)
find_package(pva-sdk REQUIRED)
project(<project name>)

Note that the find_package directive occurs before the project directive. This ordering is optional if you specify your own toolchain with the CMAKE_TOOLCHAIN_FILE option, but it is required if you want to use the toolchain files distributed with the SDK.

Global Options

The full set of build options accepted by the PVA SDK build system is listed below. These can be passed to CMake as options (-D<VAR>=<VAL>) or as environment variables the first time CMake is invoked. Because these values are cached at CMake configuration time, updating your environment after running CMake does not update the cached values. To change a cached value, re-run CMake and pass the option explicitly with -D, or clear the CMake cache and re-run CMake. Using a separate build directory for each configuration is recommended.

PVA_GEN2_ASIP_PATH — Path to Synopsys ASIP tools to use for GEN2. Defaults to the install path of the ASIP programmer package distributed by NVIDIA.
PVA_GEN3_ASIP_PATH — Path to Synopsys ASIP tools to use for GEN3. Defaults to the install path of the ASIP programmer package distributed by NVIDIA.
PVA_DEFAULT_GEN — PVA Generation(s) to target if not explicitly provided with pva_device (defaults to ALL)
PVA_SAFETY — Whether to link with the safety-certified PVA libraries. Defaults to OFF.
PVA_BUILD_MODE — Build mode. Valid options are NATIVE, QNX, L4T, or SIM. Defaults to NATIVE.
PVA_USE_CCACHE — Whether to use ccache. Defaults to ON.
PVA_PROFILING_MODE — ON causes the VPU cycle counter to continue incrementing while waiting on the DMA signal. Defaults to OFF.
PVA_VPU_SDK — Specify the name of the VPU SDK, if running on a system where multiple flavors are available.
PVA_ONLY_BUILD_GENS — Allows users to build for a single PVA generation. This prevents device code for all other generations being built, even if they are flagged as supported in pva_device or PVA_DEFAULT_GEN.
PVA_DISASSEMBLE — Output human readable machine code mnemonics (extension .lst) for device code objects and executables.

Note

For compatibility reasons, all of the above options can also be specified with cuPVA as prefix instead of PVA.
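One way to keep a configuration reproducible is to collect the options in a CMake initial-cache script and load it with CMake's -C option, which populates the cache in the same way as a series of -D arguments. The sketch below is illustrative only: the file name is hypothetical, and the value format shown for PVA_ONLY_BUILD_GENS (a bare GEN3) is an assumption based on the generation names used elsewhere in this section.

```cmake
# pva-l4t-gen3.cmake (hypothetical file name)
# Pre-loads cache entries; equivalent to passing each one with -D.

# Build mode: one of NATIVE, QNX, L4T, or SIM.
set(PVA_BUILD_MODE L4T CACHE STRING "PVA SDK build mode")

# Assumed value format: build device code for GEN3 only.
set(PVA_ONLY_BUILD_GENS GEN3 CACHE STRING "Restrict device builds to one generation")

# Emit .lst disassembly listings next to device objects and executables.
set(PVA_DISASSEMBLE ON CACHE BOOL "Generate human readable disassembly")
```

Such a script could then be used with a dedicated build directory, for example: cmake -C pva-l4t-gen3.cmake -S . -B build-l4t-gen3.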

Using ASIP Tools

In some modes, the PVA SDK build scripts invoke ASIP programmer to compile VPU executables. This requires a valid installation of the appropriate ASIP programmer version. NVIDIA provides Debian packages of ASIP programmer, which install by default to /opt/nvidia/pva-sdk/asip-programmer-<version>. The build system searches these paths by default. If this is not the desired behavior, or if you are not using the NVIDIA packages of ASIP programmer, set the PVA_<GENX>_ASIP_PATH options (see above).

Additionally, PVA SDK build scripts assume that the build environment is appropriately configured for FlexLM licensing with Synopsys products. This usually requires setting one or both of the LM_LICENSE_FILE or SNPSLMD_LICENSE_FILE environment variables.

LM_LICENSE_FILE can be used to specify a file containing multiple licensing server options, while SNPSLMD_LICENSE_FILE can be used to explicitly specify licensing servers. Specifying SNPSLMD_LICENSE_FILE instead of LM_LICENSE_FILE can improve compilation speed, because a smaller set of servers needs to be queried to find a valid license.

An example usage of LM_LICENSE_FILE is as follows:

export LM_LICENSE_FILE=<PATH_TO_LICENSE_FILE>

An example usage of SNPSLMD_LICENSE_FILE is as follows:

export SNPSLMD_LICENSE_FILE=<TCP_PORT>@<LICENSE_SERVER>

For detailed information, refer to the Synopsys or FlexLM documentation.
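To surface a missing license configuration at configure time rather than during compilation, a project could add a check like the following near the top of its CMakeLists.txt. This is a sketch and not part of the PVA SDK scripts; it only tests whether either variable is defined and does not verify that a license can actually be obtained.

```cmake
# Warn early if neither FlexLM variable is present in the environment.
# This does not validate the license itself, only that something is set.
if(NOT DEFINED ENV{LM_LICENSE_FILE} AND NOT DEFINED ENV{SNPSLMD_LICENSE_FILE})
    message(WARNING
        "Neither LM_LICENSE_FILE nor SNPSLMD_LICENSE_FILE is set; "
        "VPU compilation with ASIP programmer may fail to obtain a license.")
endif()
```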

Cross-Compilation

PVA SDK CMake scripts are distributed with default toolchain files which can be used to build for SoCs running Linux and QNX operating systems. The toolchain files control compilation of host code, but do not affect compilation of VPU code. Toolchain files are selected based on the value of PVA_BUILD_MODE, and may be overridden explicitly by the user with CMAKE_TOOLCHAIN_FILE. Refer to the CMake documentation for more information.
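For readers who want to override the distributed toolchain files, a user-supplied toolchain file might look like the sketch below. It uses only standard CMake variables; the file name and the compiler names are assumptions about the local cross toolchain and are not taken from the PVA SDK.

```cmake
# my-aarch64-toolchain.cmake (hypothetical file name)
# Describes the host-code cross compiler; VPU code is not affected.

set(CMAKE_SYSTEM_NAME Linux)          # target operating system
set(CMAKE_SYSTEM_PROCESSOR aarch64)   # target CPU architecture

# Assumed compiler names; adjust to the cross toolchain installed locally.
set(CMAKE_C_COMPILER aarch64-linux-gnu-gcc)
set(CMAKE_CXX_COMPILER aarch64-linux-gnu-g++)
```

The file would be selected at configure time with -DCMAKE_TOOLCHAIN_FILE=<path to the file>.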

Function Reference

The following CMake functions are exposed after successfully calling find_package(pva-sdk).

add_executable_pva

add_executable_pva(name
                 HOST <host src files>
                 DEVICE <device_target_1> <device_target_2>...
                 [STATIC_HOST] (Force linking target <name> against static cuPVA host library.
                                Without this option specified, target <name> will link to dynamic cuPVA host library.)
)

Build a PVA executable. Host source files are added to a standard CMake target named <name>. Device targets are embedded in the target as data blobs. Device targets should be declared with pva_device or pva_device_import.
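As an illustration of how these pieces fit together, a minimal project might look like the sketch below. The project, target, and source file names are hypothetical, and the use of a .c file for device code is an assumption.

```cmake
cmake_minimum_required(VERSION 3.22)
find_package(pva-sdk REQUIRED)
project(pva_example)

# Device code; built for the VPU or natively depending on PVA_BUILD_MODE.
pva_device(example_dev example_dev.c)

# Host executable with the device target embedded as a data blob.
# Adding STATIC_HOST would link the static cuPVA host library instead.
add_executable_pva(example_app
    HOST main.cpp
    DEVICE example_dev
)
```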

add_library_pva

add_library_pva(name
                HOST <host src files>
                DEVICE <device_target_1> <device_target_2>...
                [STATIC_HOST] (Force linking target <name> against static cuPVA host library.
                               Without this option specified, target <name> will link to dynamic cuPVA host library.
                               Since add_library_pva builds a static .a library, this option applies to CMake target transitive
                               dependencies.)
)

Equivalent to add_executable_pva, but emits a static library (.a) instead of an executable.
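A possible usage is sketched below. All target and file names are hypothetical, and linking the result with target_link_libraries assumes that the library is exposed as a standard CMake target, in the same way as the target created by add_executable_pva.

```cmake
# Device code to embed in the library.
pva_device(filter_dev filter_dev.c)

# Static library containing host code plus the embedded device target.
add_library_pva(filter_lib
    HOST filter_host.cpp
    DEVICE filter_dev
)

# Assumption: filter_lib can be consumed like any other CMake library target.
add_executable(filter_demo demo_main.cpp)
target_link_libraries(filter_demo PRIVATE filter_lib)
```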

pva_device

pva_device(name <src1> <src2> ...  (sources to build on all generations)
           [NO_DEFAULT_CFLAGS] (By default, some recommended flags are appended when compiling source files.
                                If NO_DEFAULT_CFLAGS is specified, these are skipped.)
           [NO_DEFAULT_LFLAGS] (By default, some recommended flags are appended when linking object files.
                                If NO_DEFAULT_LFLAGS is specified, these are skipped.)
           [PVA_GENS] [[GEN2] [GEN3] [ALL]] (which PVA generations this binary should support.
                                             If not specified, defaults to the value PVA_DEFAULT_GEN)
           [SOURCES_GEN2] <src1> <src2> ... (sources to build for GEN2 only)
           [SOURCES_GEN3] <src1> <src2> ... (sources to build for GEN3 only)
           [LIBS] <lib1> <lib2> ... (name of libraries created with pva_device_lib)
           [CFLAGS] <flag1> <flag2> ... (command line arguments to pass to all compilers)
           [CFLAGS_DEVICE] <flag1> <flag2> ... (command line arguments to pass compiler when building for target processor e.g. VPU)
           [CFLAGS_NATIVE] <flag1> <flag2> ... (command line arguments to pass compiler when building for NATIVE)
           [LFLAGS_DEVICE] <flag1> <flag2> ... (command line arguments to pass linker when building for target processor e.g. VPU)
           [BCF] <bcf script> (linker script for use with VPU builds)
           [USE_NOODLE] (to build VPU sources with chess noodle front end)
)

Define a target for building a PVA device code executable. Whether it is built natively or for the VPU depends on the value of PVA_BUILD_MODE. After invoking this function, <name> is a target that can be used in add_executable_pva or add_library_pva. The raw .elf (VPU builds) or .so (native builds) is also built and can be used directly with the cupva::Executable::Create() API.
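The following sketch shows how several of the optional fields might be combined. The target name, source files, and the preprocessor define are hypothetical; the flags are passed through unchanged, so whether a given flag is accepted depends on the compiler in use.

```cmake
pva_device(blur_dev
    blur_common.c                 # built for every targeted generation
    PVA_GENS GEN2 GEN3            # generations this binary should support
    SOURCES_GEN2 blur_gen2.c      # built for GEN2 only
    SOURCES_GEN3 blur_gen3.c      # built for GEN3 only
    CFLAGS -DBLUR_RADIUS=3        # passed to all compilers
    CFLAGS_NATIVE -g              # passed only when building for NATIVE
)
```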

pva_device_lib

pva_device_lib(name <src1> <src2> ...  (sources to build on all generations)
               [NO_DEFAULT_CFLAGS] (By default, some recommended flags are appended when compiling source files.
                                    If NO_DEFAULT_CFLAGS is specified, these are skipped.)
               [PVA_GENS] [[GEN2] [GEN3] [ALL]] (which PVA generations this binary should support.
                                                 If not specified, defaults to the value PVA_DEFAULT_GEN)
               [SOURCES_GEN2] <src1> <src2> ... (sources to build for GEN2 only)
               [SOURCES_GEN3] <src1> <src2> ... (sources to build for GEN3 only)
               [CFLAGS] <flag1> <flag2> ... (command line arguments to pass to all compilers)
               [CFLAGS_DEVICE] <flag1> <flag2> ... (command line arguments to pass compiler when building for target processor e.g. VPU)
               [CFLAGS_NATIVE] <flag1> <flag2> ... (command line arguments to pass compiler when building for NATIVE)
               [USE_NOODLE] (to build VPU sources with chess noodle front end)
)

Define a target for building a PVA device code static library. The resulting target can be used in the LIBS field of pva_device.
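A short sketch of a device library consumed by a device executable is shown below; the target and file names are hypothetical.

```cmake
# Device-side static library shared between kernels.
pva_device_lib(math_utils math_utils.c PVA_GENS ALL)

# Device executable that links against the library via the LIBS field.
pva_device(kernel_dev kernel.c LIBS math_utils)
```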

pva_device_import

pva_device_import(name
    [GEN2] <filename | target name>
    [GEN3] <filename | target name>)

Create an imported target which can be used in place of targets created with pva_device_lib or pva_device. Users may specify paths to pre-built binaries on their filesystem, or alternatively the name of a target previously created with a pva_device_* function.
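For example, pre-built binaries could be wrapped as follows. The paths and target names are hypothetical, and the .elf extension assumes binaries built for the VPU.

```cmake
# Wrap pre-built per-generation binaries in an imported device target.
pva_device_import(prebuilt_dev
    GEN2 ${CMAKE_CURRENT_SOURCE_DIR}/prebuilt/kernel_gen2.elf
    GEN3 ${CMAKE_CURRENT_SOURCE_DIR}/prebuilt/kernel_gen3.elf
)

# The imported target is used like one created with pva_device.
add_executable_pva(import_app
    HOST main.cpp
    DEVICE prebuilt_dev
)
```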

pva_allowlist

pva_allowlist(<name of target> <set name>)

Adds the device target created with one of the pva_device* functions to a VPU allowlist set. For each set, up to three allowlists are generated in the top level build directory:

pva_allowlist_<set name>_gen2 — Contains only VPU ELFs in the set built for gen2
pva_allowlist_<set name>_gen3 — Contains only VPU ELFs in the set built for gen3
pva_allowlist_<set name>_all — Contains all VPU ELFs in the set

All VPU targets are added to the “default” allowlist automatically; calling pva_allowlist is not required for this.
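As a sketch, adding a device target to a custom set could look like this. The target name, source file, and the set name production are hypothetical.

```cmake
pva_device(camera_dev camera_kernel.c)

# Adds camera_dev to the "production" set. Following the naming scheme above,
# this would yield pva_allowlist_production_gen2, pva_allowlist_production_gen3,
# and pva_allowlist_production_all in the top level build directory,
# depending on which generations are built.
pva_allowlist(camera_dev production)
```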

pva_device_testlib

pva_device_testlib(name
         <DEVICE EXE ARGS> (Pass any arguments which are recognized by pva_device.)
         PUBLIC_HEADERS <header1> <header2> ... (Headers containing public interfaces for testing methods. Must be able to be compiled
                                                 for both host and device. For example, these headers cannot contain include statements
                                                 for cuPVA host or device headers.)
         ENTRYPOINTS <entrypoint1> <entrypoint2> ... (Name of entrypoints which will be called from host code and implemented in device code.
                                                      Each entrypoint must match the signature void <entrypoint>(T *data).
                                                      T must have the same layout between host and device code and should not contain pointers.)
         [DATA_BANK] [A|B|C|D] (Superbank to use for passing data to test functions. Defaults to Bank B.)
         [DATA_SIZE] <integer> (Maximum size of T in bytes. Defaults to a full superbank, 131072 bytes.)
)

Creates a host-side static library out of device code. Specified device-side entrypoints may be called directly from the host code. This allows calling device-side APIs directly from host code, without needing to use cuPVA host-side APIs to register memory and submit.

An application entrypoint (such as CUPVA_MAIN) should not be provided for this target. Function signatures for ENTRYPOINTS should be present in PUBLIC_HEADERS. Users should not overload ENTRYPOINTS.

Targets created with this function are for testing purposes only and should not be used in production situations.

For a demonstration of pva_device_testlib, refer to ‘device_test’ in the PVA SDK samples.
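A call might be structured as in the sketch below. Every name here is hypothetical, the placement of the device source as the first pva_device argument is an assumption, and so is the use of target_link_libraries to consume the resulting host-side static library. The 'device_test' sample remains the authoritative reference.

```cmake
pva_device_testlib(scale_testlib
    scale_impl.c                      # device source, forwarded as a pva_device argument
    PUBLIC_HEADERS scale_test_api.h   # must compile for both host and device
    ENTRYPOINTS scale_test_entry      # void scale_test_entry(T *data)
    DATA_BANK B                       # superbank used to pass data (the default)
    DATA_SIZE 4096                    # maximum size of T in bytes
)

# Assumption: the test library is linked into a host test binary like any
# other static library target.
add_executable(scale_test scale_test_main.cpp)
target_link_libraries(scale_test PRIVATE scale_testlib)
```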

Building Device Code

This section describes how to build VPU applications.

Overriding Default BCF File

The BCF file (bridge config file) controls linking and memory layout for VPU ELFs and is required to build VPU device code. The cuPVA runtime provides a default BCF, which may be overridden with the BCF field of pva_device. This can be useful when you need to carefully lay out VPU code for performance reasons.

A custom BCF file should include the default cuPVA runtime BCF file, after optionally specifying some defines to control the sizes of special VMEM regions. The following defines may be specified; all sizes are rounded up to 64-byte boundaries internally:

STACK_SIZE — Controls the size of the stack. Defaults to 0x1000.
GLOBAL_SIZE — Region to use for global/static variables. Does not include variables assigned to specific banks. Defaults to 0x800.
CUPVA_NOMAP_MAIN — Ordinarily, the user defined entrypoint CUPVA_MAIN is mapped at the start of the program. This define can be set to allow user to map the entrypoint to a later part of the program.

Adjusting these sizes up or down may be required for certain applications.

After including the default BCF file, you can specify function layout using the _next directive with symbol names. Note that function symbols are mangled by default when building with chess.

For example, a custom BCF file may look like this:

#define STACK_SIZE 0x800 /* Reduce stack size to allow use of more VMEM for buffers */
#include <cupva_vpu.bcf>

_symbol some_mangled_function_name _next
_symbol some_other_mangled_function_name _next
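A custom BCF file such as the one above is attached to a device target through the BCF field of pva_device. In this sketch the target, source, and BCF file names are hypothetical, and passing an absolute path is an assumption made to avoid depending on how relative paths are resolved.

```cmake
pva_device(tuned_dev
    tuned_kernel.c
    # Custom linker script replacing the default cuPVA runtime BCF.
    BCF ${CMAKE_CURRENT_SOURCE_DIR}/tuned_layout.bcf
)
```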

Building Device Code without CMake

It is possible to build VPU code without using PVA SDK scripts. In particular, the following utilities provide command line access to the VPU toolchain:

chesscc: compiler
ear: static archiver
bridge: linker

These utilities require specifying various paths to the VPU models provided with the PVA SDK. VPU models are installed under tools/vpu_sdks under the PVA SDK install directory.

Refer to the ASIP programmer manual for full details on how to use these tools.