CmdProgram#

Fully qualified name: cupva::CmdProgram

Defined in src/host/cpp_api/include/cupva_host.hpp

class CmdProgram : public cupva::BaseCmd#

Basic unit of work for submission to the PVA engine.

This object combines several specifications, such as an optional cupva::Executable and a set of data flows, into a unit of work for the PVA engine.

Specifying an Executable is optional; this allows the user to submit a CmdProgram which only executes DMA operations. If a cupva::Executable is specified, the user must ensure it remains in a fully constructed and valid state for the greater of (a) the lifetime of the CmdProgram object and (b) the duration for which the CmdProgram may be executing on the PVA device.
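A minimal sketch of the two construction paths, using the static Create() overloads documented below (how the Executable is obtained is elided):

```cpp
#include <cupva_host.hpp>

void buildPrograms(cupva::Executable const &exec)
{
    // DMA-only unit of work: no VPU executable is bound.
    cupva::CmdProgram dmaOnly = cupva::CmdProgram::Create();

    // Unit of work bound to VPU executable code. The caller must keep
    // exec valid while prog exists or may be executing on the device.
    cupva::CmdProgram prog = cupva::CmdProgram::Create(exec);
}
```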

Public Functions

CmdProgram() noexcept#

Construct a default CmdProgram object.

Usage considerations

  • Allowed context for the API call

    • Thread-safe: No

  • API group

    • Init: Yes

    • Runtime: No

    • De-Init: No

CmdProgram(CmdProgram &&obj) noexcept#

Move constructor.

Usage considerations

  • Allowed context for the API call

    • Thread-safe: No

  • API group

    • Init: No

    • Runtime: Yes

    • De-Init: No

CmdProgram &operator=(CmdProgram &&obj) & noexcept#

Move assignment.

Usage considerations

  • Allowed context for the API call

    • Thread-safe: No

  • API group

    • Init: No

    • Runtime: Yes

    • De-Init: No

CmdProgram(CmdProgram const&) = delete#
CmdProgram &operator=(CmdProgram const&) & = delete#
~CmdProgram() noexcept#

Destroy the Program object.

Usage considerations

  • Allowed context for the API call

    • Thread-safe: No

  • API group

    • Init: No

    • Runtime: No

    • De-Init: Yes

void setL2Size(int32_t const size)#

Set CmdProgram instance L2SRAM utilization.

L2SRAM is a fast on-chip memory exposed in CUPVA as a temporary scratchpad type memory.

L2SRAM memory is typically scoped to the CmdProgram. It is allocated on demand while processing CmdProgram instances from a particular submission. If the allocation request cannot be fulfilled before the CmdProgram begins execution, the submission will stall until the allocation succeeds.

An L2SRAM allocation is shared between subsequent (in the submission order) CmdProgram instances that have declared a non-zero L2SRAM use and belong to the same submission. The allocation size in this case is defined as the largest allocation among subsequent CmdProgram instances that use L2SRAM.

An application programmer can use cupva::mem::GetL2BaseAddress() API to query a symbolic device pointer to the allocated L2SRAM and use it in data flow APIs.

Usage considerations

  • Allowed context for the API call

    • Thread-safe: No

  • API group

    • Init: No

    • Runtime: Yes

    • De-Init: No

Parameters:

size – The required L2SRAM size in bytes.

Throws:

cupva::Exception(InvalidArgument) – if size < 0 or size > cupva::config::MAX_L2SRAM_SIZE
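A sketch of declaring L2SRAM use for a program; the return type of cupva::mem::GetL2BaseAddress() is assumed here to be a device pointer usable in data flow APIs, as described above:

```cpp
cupva::CmdProgram prog = cupva::CmdProgram::Create();

// Declare that this program needs 64 KiB of L2SRAM scratchpad.
prog.setL2Size(64 * 1024);

// Symbolic device pointer to the L2SRAM allocation, usable as a
// src/dst in data flow APIs (return type assumed).
auto l2Base = cupva::mem::GetL2BaseAddress();
```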

Parameter operator[](char_t const *const name)#

Get a handle to a VMEM object by querying its literal string name.

Usage considerations

  • Allowed context for the API call

    • Thread-safe: Yes

  • API group

    • Init: Yes

    • Runtime: No

    • De-Init: No

Parameters:

name – The literal string name to query.

Returns:

Parameter& The reference to the queried Parameter.
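For example, assuming device code that declares a VMEM symbol named "lut" (a hypothetical name) and a valid cupva::Executable exec, the handle can be fetched by name:

```cpp
cupva::CmdProgram prog = cupva::CmdProgram::Create(exec);

// "lut" is a hypothetical VMEM symbol name declared in device code.
cupva::Parameter &p = prog["lut"];
```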

template<typename T>
inline auto addDataFlow() -> T&#

Allocate a T-typed dataflow in the Program.

This method allocates a dataflow in the current Program object and returns a reference to it. The Program instance owns the dataflow, so the user may only manipulate it through the returned reference or pointer.

Usage considerations

  • Allowed context for the API call

    • Thread-safe: No

  • API group

    • Init: Yes

    • Runtime: No

    • De-Init: No

Template Parameters:

T – The dataflow type, can be one of: StaticDataFlow, ConfigDataFlow, RasterDataFlow or DynamicDataFlow.

Throws:
  • std::bad_alloc – if memory cannot be allocated for this operation

  • see GetHardwareInfo() for exceptions which may be thrown when T is RasterDataFlow

  • cupva::Exception(NotAllowedInOperationalState) – if called when NVIDIA DRIVE OS VM state is “Operational”

Returns:

T& The reference to the allocated dataflow in the Program.
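A sketch of allocating one of the listed dataflow types (the namespace qualification of StaticDataFlow is an assumption):

```cpp
// The Program owns the allocated dataflow; hold only the reference.
cupva::StaticDataFlow &sdf = prog.addDataFlow<cupva::StaticDataFlow>();
// ... configure sdf via its own API (elided) ...
```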

template<typename T, typename std::enable_if<!std::is_pointer<T>::value, bool>::type = true>
inline auto addDataFlowHead(
int32_t const phase = 0,
float const allocWeight = 1.0F
) -> T&#

Allocate a DataFlow head and get its reference.

All chains of DataFlows must commence with a head. This tells CUPVA which DataFlows should be executed first (possibly in parallel in the case of multiple DataFlow heads).

The DMA engine must allocate transfer buffers to each DMA channel. These buffers may be shared by channels which are not active simultaneously. Users can describe which chains of DataFlows may be active simultaneously by assigning phase labels. Chains of DataFlows sharing a phase label may be active simultaneously. Chains of DataFlows with different phase labels may not be active simultaneously.

Specifying phase labels allows CUPVA to allocate DMA transfer buffers between the channels used for each of these DataFlow chains, and share transfer buffers for channels which are not active simultaneously.

The default phase is 0. All linked DataFlows share the same phase index as the head. If a phase is not specified when creating a DataFlow head, it will be added to the default phase, which in many cases will lead to sub-optimal use of transfer buffers.

For example, assuming a dataflow-list represents a chain of DataFlows which start with a head and are linked together with append() or link():

// +----------------+
// | dataflow-list0 |
// +----------------+
//    +----------------+
//    | dataflow-list1 |
//    +----------------+
//                         +----------------+
//                         | dataflow-list2 |
//                         +----------------+
// ---------------------------------------------> time

dataflow-list0 and dataflow-list1 overlap in time, while dataflow-list2 is independent. Instead of grouping all three in the same default phase, separating dataflow-list2 into a standalone phase provides better performance for all dataflows.

It is important that the user ensures that DataFlow lists which are given different phases with this API are not active at the same time. This should be achieved via careful use of trigger/sync APIs in device code. Failure to ensure this can lead to data corruption or instability.

Usage considerations

  • Allowed context for the API call

    • Thread-safe: No

  • API group

    • Init: Yes

    • Runtime: No

    • De-Init: No

Template Parameters:

T – The dataflow type, can be one of: StaticDataFlow, ConfigDataFlow, RasterDataFlow or DynamicDataFlow.

Parameters:
  • phase – Index identifying the Phase the dataflow-list belongs to.

  • allocWeight – Floating point number which will be multiplied with the maximum transfer size linked to this DataFlowHead when determining transfer buffer allocation. Must be positive and non-zero.

Throws:
  • std::bad_alloc – if memory cannot be allocated for this operation

  • see GetHardwareInfo() for exceptions which may be thrown when T is RasterDataFlow

  • cupva::Exception(NotAllowedInOperationalState) – if called when NVIDIA DRIVE OS VM state is “Operational”

Returns:

T& The reference to the allocated DataFlow head.
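Applied to the timeline above, a sketch of phase assignment (using StaticDataFlow as the head type and assuming its namespace qualification):

```cpp
// dataflow-list0 and dataflow-list1 may be active at the same time,
// so their heads share phase 0.
auto &head0 = prog.addDataFlowHead<cupva::StaticDataFlow>(0);
auto &head1 = prog.addDataFlowHead<cupva::StaticDataFlow>(0);

// dataflow-list2 is never active with the others: giving it its own
// phase lets its DMA channels share transfer buffers with the rest.
auto &head2 = prog.addDataFlowHead<cupva::StaticDataFlow>(1);
```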

template<typename T, typename std::enable_if<!std::is_pointer<T>::value, bool>::type = true>
inline auto addDataFlowHead(
BaseDataFlow const &sharedDataFlow
) -> T&#

Allocate a DataFlow head with resource sharing and get its reference.

All chains of DataFlows must commence with a head. This tells CUPVA which DataFlows should be executed first (possibly in parallel in the case of multiple DataFlow heads).

The transfer buffers used by this DataFlow and its linked DataFlows will be shared with the specified BaseDataFlow (which must be a head). These DataFlows must therefore not be active simultaneously. Additionally, any DataFlows created with this API (or linked to such a DataFlow) must not be active concurrently with any DataFlows which have phases different to the specified BaseDataFlow. These conditions should be achieved via careful use of trigger/sync APIs in device code. Failure to ensure this can lead to data corruption or instability.

Usage considerations

  • Allowed context for the API call

    • Thread-safe: No

  • API group

    • Init: Yes

    • Runtime: No

    • De-Init: No

Template Parameters:

T – The dataflow type, can be one of: StaticDataFlow, ConfigDataFlow, RasterDataFlow or DynamicDataFlow.

Parameters:

sharedDataFlow – The DataFlow with which to share transfer buffers.

Throws:
  • std::bad_alloc – if memory cannot be allocated for this operation

  • see GetHardwareInfo() for exceptions which may be thrown when T is RasterDataFlow

  • cupva::Exception – if sharedDataFlow was not created as a head

  • cupva::Exception(NotAllowedInOperationalState) – if called when NVIDIA DRIVE OS VM state is “Operational”

Returns:

T& The reference to the allocated DataFlow head.
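A sketch of transfer-buffer sharing between two heads that are guaranteed (by device-side trigger/sync) never to be active together (StaticDataFlow and its namespace qualification are assumptions):

```cpp
auto &headA = prog.addDataFlowHead<cupva::StaticDataFlow>(0);

// headB reuses headA's transfer buffers. Device code must guarantee
// that the two chains are never active simultaneously.
auto &headB = prog.addDataFlowHead<cupva::StaticDataFlow>(headA);
```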

void compileDataFlows()#

Compile the dataflows of the Program.

The dataflows in the program cannot be used until they are compiled. After allocating and configuring dataflows in the host code, call this method to compile them into an internal format which can be submitted to hardware.

The internal representation may be inspected with use of the CUPVA host utility API. This may be beneficial when debugging functional issues or tuning performance.

Usage considerations

  • Allowed context for the API call

    • Thread-safe: No

  • API group

    • Init: Yes

    • Runtime: No

    • De-Init: No

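The typical host-side ordering, sketched end to end (dataflow configuration details are elided, and exec stands for a valid cupva::Executable):

```cpp
cupva::CmdProgram prog = cupva::CmdProgram::Create(exec);
auto &df = prog.addDataFlow<cupva::StaticDataFlow>();
// ... configure df and any other dataflows (elided) ...

// Compile once after configuration; only then can the program be
// submitted to hardware.
prog.compileDataFlows();
```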
void updateDataFlows()#

Update the dataflows of the Program.

Call this method after updating src/dst OffsetPointers in dataflows so that the corresponding fields in DMA descriptors will be updated.

Usage considerations

  • Allowed context for the API call

    • Thread-safe: No

  • API group

    • Init: No

    • Runtime: Yes

    • De-Init: No
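The update pattern, sketched (how the OffsetPointers are rebound is elided, since it depends on the dataflow type):

```cpp
// ... change the src/dst OffsetPointers held by the dataflows ...

// Propagate the new addresses into the DMA descriptors before the
// next submission of this program.
prog.updateDataFlows();
```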

void finalize()#

Destroy the resources created by a BaseCmd object.

This method is exposed to allow fine-grained control over error handling.

The destructor calls this method during destruction of the object, but the destructor must not propagate exceptions. To handle exceptions, invoke this method manually before the object is destroyed.

Usage considerations

  • Allowed context for the API call

    • Thread-safe: No

  • API group

    • Init: No

    • Runtime: No

    • De-Init: Yes

Throws:
  • cupva::Exception(DriverAPIError) – if driver returns error during de-initialization of mapped/pinned memory

  • cupva::Exception(NotAllowedInOperationalState) – if called when NVIDIA DRIVE OS VM state is “Operational”
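To observe de-initialization errors instead of letting the destructor suppress them, finalize explicitly (catching by cupva::Exception const& assumes the exception type documented above):

```cpp
try {
    prog.finalize(); // release resources while errors are still catchable
} catch (cupva::Exception const &e) {
    // handle DriverAPIError / NotAllowedInOperationalState here
}
// ~CmdProgram() now has nothing left that can throw.
```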

Public Static Functions

static CmdProgram Create()#

Construct a new Program without VPU executable code (DMA-only).

Usage considerations

  • Allowed context for the API call

    • Thread-safe: Yes

  • API group

    • Init: Yes

    • Runtime: No

    • De-Init: No

Throws:
  • cupva::Exception(DriverAPIError) – if driver API layer returns an error due to:

    • Memory allocation failure

    • Pinning or mapping memory failure

    • Unable to detect chip information

  • cupva::Exception(NotAllowedInOperationalState) – if called when NVIDIA DRIVE OS VM state is “Operational”

Returns:

Allocated program object.

static CmdProgram Create(const Executable &exec)#

Construct a new Program object and bind an existing VPU executable.

Usage considerations

  • Allowed context for the API call

    • Thread-safe: Yes

  • API group

    • Init: Yes

    • Runtime: No

    • De-Init: No

Parameters:

exec – The reference to a cupva::Executable object. The referenced object must remain valid for the lifetime of the CmdProgram object.

Throws:
  • cupva::Exception(DriverAPIError) – if driver API layer returns an error due to:

    • Memory allocation failure

    • Pinning or mapping memory failure

    • Unable to detect chip information

  • cupva::Exception(NotAllowedInOperationalState) – if called when NVIDIA DRIVE OS VM state is “Operational”

Returns:

Allocated program object.