CmdProgram#

Fully qualified name: cupva::CmdProgram

Defined in src/host/cpp_api/include/cupva_host.hpp

class CmdProgram : public cupva::BaseCmd#

Basic unit of work for submission to the PVA engine.

This object combines several specifications, such as an optional cupva::Executable and a set of data flows, into a unit of work for the PVA engine.

Specifying an Executable is optional; this allows the user to submit a CmdProgram which only executes DMA operations. If a cupva::Executable is specified, the user must ensure it remains in a fully constructed and valid state for the greater of (a) the lifetime of the CmdProgram object and (b) the duration for which the CmdProgram may be executing on the PVA device.
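A minimal sketch of the two construction paths, using the static Create() overloads documented below (how the Executable is obtained is elided):

```cpp
#include <cupva_host.hpp>

void buildPrograms(cupva::Executable const &exec)
{
    // DMA-only unit of work: no VPU executable is bound.
    cupva::CmdProgram dmaOnly = cupva::CmdProgram::Create();

    // Unit of work bound to VPU executable code. The caller must keep
    // exec valid while prog exists or may be executing on the device.
    cupva::CmdProgram prog = cupva::CmdProgram::Create(exec);
}
```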

Public Functions

CmdProgram() noexcept#

Construct a default CmdProgram object.

Usage considerations

  • Allowed context for the API call

    • Thread-safe: No

  • API group

    • Init: Yes

    • Runtime: No

    • De-Init: No

CmdProgram(CmdProgram &&obj) noexcept#

Move constructor.

Usage considerations

  • Allowed context for the API call

    • Thread-safe: No

  • API group

    • Init: No

    • Runtime: Yes

    • De-Init: No

CmdProgram &operator=(CmdProgram &&obj) & noexcept#

Move assignment.

Usage considerations

  • Allowed context for the API call

    • Thread-safe: No

  • API group

    • Init: No

    • Runtime: Yes

    • De-Init: No

CmdProgram(CmdProgram const&) = delete#
CmdProgram &operator=(CmdProgram const&) & = delete#
~CmdProgram() noexcept#

Destroy the Program object.

Usage considerations

  • Allowed context for the API call

    • Thread-safe: No

  • API group

    • Init: No

    • Runtime: No

    • De-Init: Yes

void setL2Size(int32_t const size)#

Set CmdProgram instance L2SRAM utilization.

L2SRAM is a fast on-chip memory exposed in CUPVA as a temporary scratchpad type memory.

L2SRAM memory is typically scoped to the CmdProgram. It is allocated on demand while processing CmdProgram instances from a particular submission. If the allocation request cannot be fulfilled before the CmdProgram begins execution, the submission will stall until the allocation succeeds.

An L2SRAM allocation is shared between subsequent (in the submission order) CmdProgram instances that have declared a non-zero L2SRAM use and belong to the same submission. The allocation size in this case is defined as the largest allocation among subsequent CmdProgram instances that use L2SRAM.

An application programmer can use cupva::mem::GetL2BaseAddress() API to query a symbolic device pointer to the allocated L2SRAM and use it in data flow APIs.

Usage considerations

  • Allowed context for the API call

    • Thread-safe: No

  • API group

    • Init: No

    • Runtime: Yes

    • De-Init: No

Parameters:

size – The required L2SRAM size in bytes.

Throws:

cupva::Exception(InvalidArgument) – if size < 0 or size > cupva::config::MAX_L2SRAM_SIZE
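A sketch of declaring L2SRAM use for a program; the return type of cupva::mem::GetL2BaseAddress() is assumed here to be a device pointer usable in data flow APIs, as described above:

```cpp
cupva::CmdProgram prog = cupva::CmdProgram::Create();

// Declare that this program needs 64 KiB of L2SRAM scratchpad.
prog.setL2Size(64 * 1024);

// Symbolic device pointer to the L2SRAM allocation, usable as a
// src/dst in data flow APIs (return type assumed).
auto l2Base = cupva::mem::GetL2BaseAddress();
```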

Parameter operator[](char_t const *const name)#

Get a handle to a VMEM object by querying its literal string name.

Usage considerations

  • Allowed context for the API call

    • Thread-safe: Yes

  • API group

    • Init: Yes

    • Runtime: No

    • De-Init: No

Parameters:

name – The literal string name to query.

Returns:

Parameter& The reference to the queried Parameter.
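For example, assuming device code that declares a VMEM symbol named "lut" (a hypothetical name) and a valid cupva::Executable exec, the handle can be fetched by name:

```cpp
cupva::CmdProgram prog = cupva::CmdProgram::Create(exec);

// "lut" is a hypothetical VMEM symbol name declared in device code.
cupva::Parameter &p = prog["lut"];
```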

template<typename T>
inline auto addDataFlow() -> T&#

Allocate a T-typed dataflow in the Program.

This method allocates a dataflow in the current Program object and returns a reference to it. The Program instance owns the dataflow, so the user may only manipulate it through the returned reference or pointer.

Usage considerations

  • Allowed context for the API call

    • Thread-safe: No

  • API group

    • Init: Yes

    • Runtime: No

    • De-Init: No

Template Parameters:

T – The dataflow type, can be one of: StaticDataFlow, ConfigDataFlow, RasterDataFlow or DynamicDataFlow.

Throws:
  • std::bad_alloc – if memory cannot be allocated for this operation

  • see GetHardwareInfo() for exceptions which may be thrown when T is RasterDataFlow

  • cupva::Exception(NotAllowedInOperationalState) – if called when NVIDIA DRIVE OS VM state is “Operational”

Returns:

T& The reference to the allocated dataflow in the Program.
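A sketch of allocating one of the listed dataflow types (the namespace qualification of StaticDataFlow is an assumption):

```cpp
// The Program owns the allocated dataflow; hold only the reference.
cupva::StaticDataFlow &sdf = prog.addDataFlow<cupva::StaticDataFlow>();
// ... configure sdf via its own API (elided) ...
```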

template<typename T, typename std::enable_if<!std::is_pointer<T>::value, bool>::type = true>
inline auto addDataFlowHead(
int32_t const phase = 0,
float const allocWeight = 1.0F
) -> T&#

Allocate a DataFlow head and get its reference.

All chains of DataFlows must commence with a head. This tells CUPVA which DataFlows should be executed first (possibly in parallel in the case of multiple DataFlow heads).

The DMA engine must allocate transfer buffers to each DMA channel. These buffers may be shared by channels which are not active simultaneously. Users can describe which chains of DataFlows may be active simultaneously by assigning phase labels. Chains of DataFlows sharing a phase label may be active simultaneously. Chains of DataFlows with different phase labels may not be active simultaneously.

Specifying phase labels allows CUPVA to allocate DMA transfer buffers between the channels used for each of these DataFlow chains, and share transfer buffers for channels which are not active simultaneously.

The default phase is 0. All linked DataFlows share the same phase index as the head. If a phase is not specified when creating a DataFlow head, it will be added to the default phase, which in many cases will lead to sub-optimal use of transfer buffers.

For example, assuming a dataflow-list represents a chain of DataFlows which start with a head and are linked together with append() or link():

// +----------------+
// | dataflow-list0 |
// +----------------+
//    +----------------+
//    | dataflow-list1 |
//    +----------------+
//                         +----------------+
//                         | dataflow-list2 |
//                         +----------------+
// ---------------------------------------------> time

dataflow-list0 and dataflow-list1 overlap in time, while dataflow-list2 is independent. Instead of grouping all three in the same default phase, separating dataflow-list2 into a standalone phase provides better performance for all dataflows.

It is important that the user ensures that DataFlow lists which are given different phases with this API are not active at the same time. This should be achieved via careful use of trigger/sync APIs in device code. Failure to ensure this can lead to data corruption or instability.

Usage considerations

  • Allowed context for the API call

    • Thread-safe: No

  • API group

    • Init: Yes

    • Runtime: No

    • De-Init: No

Template Parameters:

T – The dataflow type, can be one of: StaticDataFlow, ConfigDataFlow, RasterDataFlow or DynamicDataFlow.

Parameters:
  • phase – Index identifying the Phase the dataflow-list belongs to.

  • allocWeight – Floating point number which will be multiplied with the maximum transfer size linked to this DataFlowHead when determining transfer buffer allocation. Must be positive and non-zero.

Throws:
  • std::bad_alloc – if memory cannot be allocated for this operation

  • see GetHardwareInfo() for exceptions which may be thrown when T is RasterDataFlow

  • cupva::Exception(NotAllowedInOperationalState) – if called when NVIDIA DRIVE OS VM state is “Operational”

Returns:

T& The reference to the allocated DataFlow head.
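Applied to the timeline above, a sketch of phase assignment (using StaticDataFlow as the head type and assuming its namespace qualification):

```cpp
// dataflow-list0 and dataflow-list1 may be active at the same time,
// so their heads share phase 0.
auto &head0 = prog.addDataFlowHead<cupva::StaticDataFlow>(0);
auto &head1 = prog.addDataFlowHead<cupva::StaticDataFlow>(0);

// dataflow-list2 is never active with the others: giving it its own
// phase lets its DMA channels share transfer buffers with the rest.
auto &head2 = prog.addDataFlowHead<cupva::StaticDataFlow>(1);
```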

template<typename T, typename std::enable_if<!std::is_pointer<T>::value, bool>::type = true>
inline auto addDataFlowHead(
BaseDataFlow const &sharedDataFlow
) -> T&#

Allocate a DataFlow head with resource sharing and get its reference.

All chains of DataFlows must commence with a head. This tells CUPVA which DataFlows should be executed first (possibly in parallel in the case of multiple DataFlow heads).

The transfer buffers used by this DataFlow and its linked DataFlows will be shared with the specified BaseDataFlow (which must be a head). These DataFlows must therefore not be active simultaneously. Additionally, any DataFlows created with this API (or linked to such a DataFlow) must not be active concurrently with any DataFlows which have phases different to the specified BaseDataFlow. These conditions should be achieved via careful use of trigger/sync APIs in device code. Failure to ensure this can lead to data corruption or instability.

Usage considerations

  • Allowed context for the API call

    • Thread-safe: No

  • API group

    • Init: Yes

    • Runtime: No

    • De-Init: No

Template Parameters:

T – The dataflow type, can be one of: StaticDataFlow, ConfigDataFlow, RasterDataFlow or DynamicDataFlow.

Parameters:

sharedDataFlow – The DataFlow with which to share transfer buffers.

Throws:
  • std::bad_alloc – if memory cannot be allocated for this operation

  • see GetHardwareInfo() for exceptions which may be thrown when T is RasterDataFlow

  • cupva::Exception – if sharedDataFlow was not created as a head

  • cupva::Exception(NotAllowedInOperationalState) – if called when NVIDIA DRIVE OS VM state is “Operational”

Returns:

T& The reference to the allocated DataFlow head.
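A sketch of transfer-buffer sharing between two heads that are guaranteed (by device-side trigger/sync) never to be active together (StaticDataFlow and its namespace qualification are assumptions):

```cpp
auto &headA = prog.addDataFlowHead<cupva::StaticDataFlow>(0);

// headB reuses headA's transfer buffers. Device code must guarantee
// that the two chains are never active simultaneously.
auto &headB = prog.addDataFlowHead<cupva::StaticDataFlow>(headA);
```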

void compileDataFlows()#

Compile the dataflows of the Program.

The dataflows in the program cannot be used until they are compiled. After allocating and configuring dataflows in the host code, call this method to compile them into an internal format which can be submitted to hardware.

The internal representation may be inspected with use of the CUPVA host utility API. This may be beneficial when debugging functional issues or tuning performance.

Usage considerations

  • Allowed context for the API call

    • Thread-safe: No

  • API group

    • Init: Yes

    • Runtime: No

    • De-Init: No

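The typical host-side ordering, sketched end to end (dataflow configuration details are elided, and exec stands for a valid cupva::Executable):

```cpp
cupva::CmdProgram prog = cupva::CmdProgram::Create(exec);
auto &df = prog.addDataFlow<cupva::StaticDataFlow>();
// ... configure df and any other dataflows (elided) ...

// Compile once after configuration; only then can the program be
// submitted to hardware.
prog.compileDataFlows();
```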
void updateDataFlows()#

Update the dataflows of the Program.

Call this method after updating src/dst OffsetPointers in dataflows so that the corresponding fields in DMA descriptors will be updated.

Usage considerations

  • Allowed context for the API call

    • Thread-safe: No

  • API group

    • Init: No

    • Runtime: Yes

    • De-Init: No
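The update pattern, sketched (how the OffsetPointers are rebound is elided, since it depends on the dataflow type):

```cpp
// ... change the src/dst OffsetPointers held by the dataflows ...

// Propagate the new addresses into the DMA descriptors before the
// next submission of this program.
prog.updateDataFlows();
```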

void finalize()#

Destroy the resources created by a BaseCmd object.

This method is exposed to allow fine-grained control over error handling.

The destructor calls this method during destruction of the object, but the destructor must not propagate exceptions. To handle exceptions, invoke this method manually before the object is destroyed.

Usage considerations

  • Allowed context for the API call

    • Thread-safe: No

  • API group

    • Init: No

    • Runtime: No

    • De-Init: Yes

Throws:
  • cupva::Exception(DriverAPIError) – if driver returns error during de-initialization of mapped/pinned memory

  • cupva::Exception(NotAllowedInOperationalState) – if called when NVIDIA DRIVE OS VM state is “Operational”
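To observe de-initialization errors instead of letting the destructor suppress them, finalize explicitly (catching by cupva::Exception const& assumes the exception type documented above):

```cpp
try {
    prog.finalize(); // release resources while errors are still catchable
} catch (cupva::Exception const &e) {
    // handle DriverAPIError / NotAllowedInOperationalState here
}
// ~CmdProgram() now has nothing left that can throw.
```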

Public Static Functions

static CmdProgram Create()#

Construct a new Program without VPU executable code (DMA-only).

Usage considerations

  • Allowed context for the API call

    • Thread-safe: Yes

  • API group

    • Init: Yes

    • Runtime: No

    • De-Init: No

Throws:
  • cupva::Exception(DriverAPIError) – if driver API layer returns an error due to:

    • Memory allocation failure

    • Pinning or mapping memory failure

    • Unable to detect chip information

  • cupva::Exception(NotAllowedInOperationalState) – if called when NVIDIA DRIVE OS VM state is “Operational”

Returns:

Allocated program object.

static CmdProgram Create(const Executable &exec)#

Construct a new Program object and bind an existing VPU executable.

Usage considerations

  • Allowed context for the API call

    • Thread-safe: Yes

  • API group

    • Init: Yes

    • Runtime: No

    • De-Init: No

Parameters:

exec – The reference to a cupva::Executable object. The referenced object must remain valid for the lifetime of the CmdProgram object.

Throws:
  • cupva::Exception(DriverAPIError) – if driver API layer returns an error due to:

    • Memory allocation failure

    • Pinning or mapping memory failure

    • Unable to detect chip information

  • cupva::Exception(NotAllowedInOperationalState) – if called when NVIDIA DRIVE OS VM state is “Operational”

Returns:

Allocated program object.