Dataflow configuration#
APIs to configure DataFlows which have been requested from a CmdProgram.
Each VPU task launch consists of a cupva::Executable associated with a number of DataFlows via the cupva::CmdProgram object. The DataFlows describe the required DMA configuration, so these APIs allow programming the PVA’s DMA engine.
The DataFlow APIs accept cuPVA device pointers which have been allocated by the client. These allocations must be appropriate for the intended use by the PVA’s DMA engine. Buffers must be allocated with the correct access permissions: output buffers must have write permission and input buffers must have read permission. The buffers must be large enough that all DMA accesses fall within the allocated buffer range. Allocating a buffer with cupva::mem::Alloc results in a buffer which may be read and written by the PVA’s DMA engine.
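The range requirement above can be checked on the host before a buffer is passed to a DataFlow. The following is a minimal sketch in plain C++ and is not part of the cuPVA API: Access2D and fitsInAllocation are hypothetical names, and the access is assumed to be a simple 2D region described by a byte offset, a row width, a row count and a line pitch. Overflow of std::size_t is not handled.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical description of a 2D DMA access, in bytes. Not a cuPVA type.
struct Access2D
{
    std::size_t offset;     // byte offset of the first row from the buffer start
    std::size_t rowBytes;   // bytes transferred per row
    std::size_t rows;       // number of rows
    std::size_t pitchBytes; // distance between the starts of consecutive rows
};

// Returns true if every byte touched by the access lies inside an allocation
// of allocBytes bytes.
bool fitsInAllocation(const Access2D &a, std::size_t allocBytes)
{
    if (a.rows == 0 || a.rowBytes == 0)
    {
        return true; // nothing is accessed
    }
    // The highest address touched is the end of the final row.
    const std::size_t end = a.offset + (a.rows - 1) * a.pitchBytes + a.rowBytes;
    return end <= allocBytes;
}
```

A check of this kind only covers the size requirement; access permissions still depend on how the buffer was allocated.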
Specifying DataFlows which require VPU triggering creates a contract between the host DataFlow setup code and the device DataFlow triggering code. The user must ensure that any cupva::CmdProgram containing DataFlows which require VPU triggering is submitted with a cupva::Executable. This cupva::Executable must ensure that all DataFlows are triggered to completion. The system does not consider the CmdProgram complete until all DataFlows have been completely triggered.
Classes#
- cupva::BaseDataFlow
DataFlow base class.
- cupva::GatherScatterDataFlow
The GatherScatterDataFlow provides the ability to load data from arbitrary addresses or store data to disparate locations in external memory.
- cupva::RasterDataFlow
RasterDataFlow is a DataFlow abstraction for processing tiles by raster scanning.
- cupva::SequenceDataFlow
SequenceDataFlow describes data transfers which execute sequentially.
- cupva::SequenceDataFlowTransfer
Represents a transfer within a SequenceDataFlow.
- cupva::TensorDataFlow
TensorDataFlow is a DataFlow abstraction for accessing 3D tiles.
Enumerations#
- uint32_t cupva::GranType
Enumeration of dataflow trigger granularity.
- uint32_t cupva::PadDirType
Define PadDirType as uint8_t using one-hot encoding.
- uint32_t cupva::PadModeType
Enumeration of dataflow-padding types.
- uint32_t cupva::ScanOrderType
Scan orders of tiling an image.
- uint32_t cupva::TransferModeType
SequenceDataFlowTransfer sequencing mode.
Enumerations#
-
enum class cupva::GranType : uint32_t#
Enumeration of dataflow trigger granularity.
This controls the boundary where the VPU-DMA handshake occurs during execution of a StaticDataFlow. At each handshake boundary, the VPU engine must synchronize the DataFlow and re-trigger before the dataflow will continue.
TILE: handshake occurs after transferring each tile.
DIM1: handshake occurs after transferring niter1 tiles, where niter1 is the first dimension of the transfer in VMEM (see srcDim1/dstDim1)
DIM2: handshake occurs after transferring niter2 iterations of dim2, where niter2 is the second dimension of the transfer in VMEM (see srcDim2/dstDim2)
ALL: handshake occurs after transferring all tiles in the DataFlow. This handshake can be omitted by using append() instead of link().
Values:
-
enumerator TILE#
-
enumerator DIM1#
-
enumerator DIM2#
-
enumerator ALL#
-
enum cupva::PadDirType#
Define PadDirType as uint8_t using one-hot encoding.
Padding is allowed in both x and y dimensions simultaneously. One-hot encoding enables bitwise-or:
// PAD_TOP | PAD_LEFT:
// +------------------+      +------------------+
// |      pad TOP     |      |                  |
// |    +-------------+      |                  |
// |pad |             |  ->  |     dst tile     |
// |LEFT|  src buff   |      |                  |
// |    |             |      |                  |
// +----+-------------+      +------------------+
Padding is only allowed in one direction per dimension:
// PAD_TOP | PAD_LEFT - valid
// PAD_TOP | PAD_BOTTOM - invalid, cannot pad in 2 directions on y-axis simultaneously
Values:
-
enumerator PAD_NONE#
no padding
-
enumerator PAD_TOP#
padding to top edge
-
enumerator PAD_BOTTOM#
padding to bottom edge
-
enumerator INVALID_PAD_TOP_BOTTOM#
invalid padding to top and bottom at the same time
-
enumerator PAD_LEFT#
padding to left edge
-
enumerator PAD_TOP_LEFT#
padding to top and left edges
-
enumerator PAD_BOTTOM_LEFT#
padding to bottom and left edges
-
enumerator INVALID_PAD_TOP_BOTTOM_LEFT#
invalid padding to top, bottom and left at the same time
-
enumerator PAD_RIGHT#
padding to right edge
-
enumerator PAD_TOP_RIGHT#
padding to top and right edges
-
enumerator PAD_BOTTOM_RIGHT#
padding to bottom and right edges
-
enumerator INVALID_PAD#
invalid padding combination for this value and any larger integer
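The validity rule for combined directions can be sketched as a small check. This is an illustration in plain C++, not the cuPVA implementation: the constants and isValidPadDir are hypothetical names, and the bit values (top = 1, bottom = 2, left = 4, right = 8) are an assumption inferred from the order of the enumerators listed above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical direction bits. The numeric values are an assumption inferred
// from the enumerator order in this section, not taken from a cuPVA header.
constexpr std::uint8_t kPadNone = 0;
constexpr std::uint8_t kPadTop = 1;
constexpr std::uint8_t kPadBottom = 2;
constexpr std::uint8_t kPadLeft = 4;
constexpr std::uint8_t kPadRight = 8;

// A combination is valid when it pads at most one side per axis and uses
// only the four known direction bits.
bool isValidPadDir(std::uint8_t dir)
{
    const bool bothY = (dir & kPadTop) && (dir & kPadBottom);
    const bool bothX = (dir & kPadLeft) && (dir & kPadRight);
    const bool unknownBits = (dir & ~(kPadTop | kPadBottom | kPadLeft | kPadRight)) != 0;
    return !bothY && !bothX && !unknownBits;
}
```

Under the assumed values, this check rejects 3, 7 and every value from 11 upwards, which matches the INVALID_* enumerators described above.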
-
enum cupva::PadModeType#
Enumeration of dataflow-padding types.
StaticDataFlow supports 2 types of padding:
Padding with a constant value (0 for instance)
// +------------------+
// | 0    0  0  0     |
// |    +-------------+
// | 0  |             |
// | 0  |             |
// | 0  |             |
// +----+-------------+
Padding with boundary-pixel-extension
// +------------------+
// | a    a  b  c     |
// |    +-------------+
// | a  | a  b  c     |
// | d  | d           |
// | e  | e           |
// +----+-------------+
Values:
-
enumerator PAD_CONST#
Padding with a constant value
-
enumerator PAD_BPE#
Padding with boundary-pixel-extension
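The difference between the two padding modes can be reproduced on the host with a small reference routine. The sketch below is plain C++ written for illustration only: PadMode and padTopLeft are hypothetical names, the tile is a row-major array of int, and only top and left padding are modelled, as in the diagrams above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for PAD_CONST / PAD_BPE.
enum class PadMode { Const, Bpe };

// Builds a (padTop + h) x (padLeft + w) tile from a w x h row-major source,
// padding the top and left edges.
std::vector<int> padTopLeft(const std::vector<int> &src, int w, int h,
                            int padLeft, int padTop, PadMode mode, int padValue)
{
    const int dstW = w + padLeft;
    const int dstH = h + padTop;
    std::vector<int> dst(static_cast<std::size_t>(dstW) * dstH);
    for (int y = 0; y < dstH; ++y)
    {
        for (int x = 0; x < dstW; ++x)
        {
            const int sx = x - padLeft; // negative inside the left padding
            const int sy = y - padTop;  // negative inside the top padding
            int value;
            if (sx >= 0 && sy >= 0)
                value = src[sy * w + sx]; // inside the source tile
            else if (mode == PadMode::Const)
                value = padValue; // constant fill
            else
                value = src[std::max(sy, 0) * w + std::max(sx, 0)]; // nearest edge pixel
            dst[y * dstW + x] = value;
        }
    }
    return dst;
}
```

With a 3x3 source whose first row is a, b, c and whose first column is a, d, e, one pixel of boundary-pixel-extension on the top and left yields a first row of a, a, b, c and a first column of a, a, d, e, as in the second diagram.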
-
enum cupva::ScanOrderType#
Scan orders of tiling an image.
This controls the tile traversal order in an image. cuPVA can support up to 8 orders using a bitmask:
bit 0: 0 - horizontally left to right, 1 - right to left
bit 1: 0 - vertically top to bottom, 1 - bottom to top
bit 2: 0 - row major, 1 - column major
Values:
-
enumerator HORIZONTAL_REVERSED#
Horizontal right to left scan
-
enumerator VERTICAL_REVERSED#
Vertical bottom to top
-
enumerator COLUMN_MAJOR#
Scan column by column
-
enumerator SCANORDER_MASK_ALL#
Scan order mask
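To see how the three bits combine, the traversal can be enumerated on the host. This is a minimal sketch in plain C++ rather than cuPVA code: the constants and tileOrder are hypothetical names that follow the bit layout described above, and tiles are identified by (x, y) grid coordinates with (0, 0) at the top left.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical constants following the bit layout described in this section.
constexpr std::uint32_t kHorizontalReversed = 1u << 0;
constexpr std::uint32_t kVerticalReversed = 1u << 1;
constexpr std::uint32_t kColumnMajor = 1u << 2;

// Returns tile coordinates (x, y) in visiting order for a cols x rows grid.
std::vector<std::pair<int, int>> tileOrder(int cols, int rows, std::uint32_t order)
{
    const bool columnMajor = (order & kColumnMajor) != 0;
    const int outerCount = columnMajor ? cols : rows;
    const int innerCount = columnMajor ? rows : cols;
    std::vector<std::pair<int, int>> result;
    for (int o = 0; o < outerCount; ++o)
    {
        for (int i = 0; i < innerCount; ++i)
        {
            // Row major walks x fastest; column major walks y fastest.
            int x = columnMajor ? o : i;
            int y = columnMajor ? i : o;
            if (order & kHorizontalReversed)
                x = cols - 1 - x; // right to left
            if (order & kVerticalReversed)
                y = rows - 1 - y; // bottom to top
            result.emplace_back(x, y);
        }
    }
    return result;
}
```

For a 2x2 grid, an order of 0 visits (0,0), (1,0), (0,1), (1,1), while column major combined with vertical reversal visits (0,1), (0,0), (1,1), (1,0).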
-
enum class cupva::TransferModeType : uint32_t#
SequenceDataFlowTransfer sequencing mode.
TransferModeType specifies the boundaries where a SequenceDataFlow must be triggered or synced.
SequenceDataFlows are sequenced in device code as follows:
When cupvaSQDFOpen() is called, the SequenceDataFlow is idle. When a SequenceDataFlow is idle:
User may call cupvaSQDFClose() to permanently close the SequenceDataFlow
User may call cupvaSQDFFlushAndTrig() to flush pending updates and trigger the SequenceDataFlow. Further updates may not be added until the next call to cupvaSQDFSync().
User may call cupvaSQDFTrig() to trigger without flushing pending updates.
After triggering with either cupvaSQDFFlushAndTrig() or cupvaSQDFTrig(), user must call cupvaSQDFSync(). This will block until a point determined by the TransferMode of the transfers. For example:
If all transfers are CONTINUOUS, the SequenceDataFlow will be idle after cupvaSQDFSync().
If the first transfer is TILE, the first tile of the first transfer will have been transferred after cupvaSQDFSync().
cupvaSQDFTrig() and cupvaSQDFSync() are called as many times as necessary to complete the sequence. See examples below.
Once the sequence has completed, the SequenceDataFlow is again idle and may be closed, triggered, or triggered with updates being flushed. The SequenceDataFlow must eventually be closed before VPU code exits.
Some examples follow.
// If all SequenceDataFlowTransfers are CONTINUOUS, all transfers will automatically commence
// by calling cupvaSQDFTrig()/cupvaSQDFFlushAndTrig(). It is required to call cupvaSQDFSync()
// to complete the sequence.
// Sequence order
// │╭────────────╮◀─Trig/FlushAndTrig
// ││ CONTINUOUS │
// │╰────────────╯
// │╭────────────╮
// ││ CONTINUOUS │
// │╰────────────╯
// │╭────────────╮
// ││ CONTINUOUS │
// │╰────────────╯─▶Sync
// ▼
// The amount of data transferred by triggers depends on the sequencing mode of the
// SequenceDataFlowTransfers. For example, a chain of TRANSFER level sequencing:
// Sequence order
// │╭────────────╮◀─Trig/FlushAndTrig
// ││  TRANSFER  │
// │╰────────────╯─▶Sync
// │╭────────────╮◀─Trig
// ││  TRANSFER  │
// │╰────────────╯─▶Sync
// │╭────────────╮◀─Trig
// ││  TRANSFER  │
// │╰────────────╯─▶Sync
// │
// ▼
// For TILE, DIM1 and DIM2, there may be multiple Trig/Sync pairs within a single
// SequenceDataFlowTransfer. For example:
// Sequence order
// │╭────────────╮◀─Trig/FlushAndTrig
// ││    TILE    │─▶Sync (first tile is transferred)
// ││            │◀─Trig
// ││            │─▶Sync (second tile is transferred)
// ││            │◀─Trig
// │╰────────────╯─▶Sync (first transfer is complete)
// │╭────────────╮◀─Trig
// ││  TRANSFER  │
// │╰────────────╯─▶Sync
// │╭────────────╮◀─Trig
// ││  TRANSFER  │
// │╰────────────╯─▶Sync
// ▼
// If a block of CONTINUOUS SequenceDataFlowTransfers is placed after non-CONTINUOUS transfers,
// a call to cupvaSQDFTrig() is required at the start of this block, and cupvaSQDFSync() is
// required at the end:
// Sequence order
// │╭────────────╮◀─Trig/FlushAndTrig
// ││ CONTINUOUS │
// │╰────────────╯─▶Sync
// │╭────────────╮◀─Trig
// ││  TRANSFER  │
// │╰────────────╯─▶Sync
// │╭────────────╮◀─Trig
// ││ CONTINUOUS │
// │╰────────────╯
// │╭────────────╮
// ││ CONTINUOUS │
// │╰────────────╯─▶Sync
// ▼
Values:
-
enumerator TILE#
Trigger is required at the start of each tile, Sync is required to check that a tile has completed.
-
enumerator DIM1#
Trigger is required to start a batch of niter1 tiles, Sync is required to check that the batch has completed (see srcDim1/dstDim1).
-
enumerator DIM2#
Trigger is required to start a batch of niter2 tiles, Sync is required to check that the batch has completed (see srcDim2/dstDim2).
-
enumerator TRANSFER#
Trigger is required at the start of the transfer, Sync is required to check that the transfer has completed.
-
enumerator CONTINUOUS#
Trigger is required at the start of a block of CONTINUOUS transfers, Sync is used to check that the block has completed.
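The trigger and sync rules for the five modes can be summarised by counting how many Trig/Sync pairs a sequence needs. The sketch below is a host-side model in plain C++ and does not use the cuPVA types: Mode, Transfer and countTrigSyncPairs are hypothetical names, and the units field is an assumed simplification holding the number of tiles (TILE) or batches (DIM1, DIM2) in a transfer.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model of the sequencing modes. Not the cuPVA enumeration.
enum class Mode { Tile, Dim1, Dim2, Transfer, Continuous };

struct Transfer
{
    Mode mode;
    int units; // tiles (Tile) or batches (Dim1/Dim2); unused for other modes
};

// Counts the Trig/Sync pairs device code must issue to complete the sequence.
int countTrigSyncPairs(const std::vector<Transfer> &sequence)
{
    int pairs = 0;
    bool inContinuousBlock = false;
    for (const Transfer &t : sequence)
    {
        switch (t.mode)
        {
        case Mode::Continuous:
            if (!inContinuousBlock)
                ++pairs; // one pair covers a whole block of CONTINUOUS transfers
            inContinuousBlock = true;
            break;
        case Mode::Transfer:
            ++pairs; // one pair per transfer
            inContinuousBlock = false;
            break;
        default:
            pairs += t.units; // Tile, Dim1, Dim2: one pair per tile or batch
            inContinuousBlock = false;
            break;
        }
    }
    return pairs;
}
```

Applied to the four diagrams above, this model gives 1, 3, 5 and 3 pairs respectively, taking the TILE transfer in the third diagram to contain three tiles.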