DeepStream to Codelet Bridge

DeepStream extensions provide a few components for interoperability between GStreamer plugin extensions and Codelets. These “bridge” components pull data out of the underlying DeepStream GStreamer pipeline and translate the DeepStream data structures representing buffers and metadata into Codelet data components, and vice versa.

The bridge components translate buffer data between DeepStream and Codelets themselves. For other data, they rely on helper components called “translators” that are derived from the INvDsGxfDataTranslator interface. DeepStream extensions provide translators, listed later, for some commonly used data structures. You can implement custom translators from the interface for custom data types without changing any core DeepStream components.

You may choose the types of data that must be translated and add the corresponding translator components to the same entity as the bridge component. No explicit links to the bridge component are required.

The bridge components are Codelet based components. They require a standard scheduler like GreedyScheduler or MultithreadedScheduler in addition to the NvDsScheduler to be part of the graph. The bridge component execution can be controlled by adding standard scheduling terms to the same entity as the bridge.

DeepStream to Codelet Bridge - NvDsToGxfBridge

The NvDsToGxfBridge component is responsible for pulling data out of the underlying pipeline, translating it to native data components used by Codelets and pushing it to the downstream Codelet components.

It acts as a sink component in the DeepStream portion of the graph. Another DeepStream INvDsElement based component must be linked via its in I/O.

NvDsToGxfBridge has two transmitter handle parameters that are used to push data:

frame-tx

Transmitter for translated frame data and related metadata. The bridge supports consuming DeepStream’s batched buffer but only translates and pushes a single frame from the batch at a time. The output message entities generated by the NvDsToGxfBridge component will contain data for a single frame, but the bridge will generate N such messages for a single batched DeepStream buffer of size N before moving to the next buffer.
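The one-message-per-frame behavior described above can be sketched as follows. This is a minimal illustration, not the bridge's implementation: `BatchedBuffer`, `FrameMessage` and `unbatch` are hypothetical stand-ins rather than DeepStream or GXF types, and emitting frames in batch order is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for illustration only; these are not DeepStream or
// GXF types. A real batch is a GstBuffer with attached NvDsBatchMeta.
struct BatchedBuffer {
  std::vector<uint64_t> source_ids;  // one entry per frame in the batch
  std::vector<uint64_t> frame_nums;  // one entry per frame in the batch
};

struct FrameMessage {
  uint64_t source_id;
  uint64_t frame_num;
};

// Produces N single-frame messages for a batch of size N.
// Assumption: frames are emitted in the order they appear in the batch.
std::vector<FrameMessage> unbatch(const BatchedBuffer& batch) {
  assert(batch.source_ids.size() == batch.frame_nums.size());
  std::vector<FrameMessage> messages;
  messages.reserve(batch.source_ids.size());
  for (std::size_t i = 0; i < batch.source_ids.size(); ++i) {
    messages.push_back({batch.source_ids[i], batch.frame_nums[i]});
  }
  return messages;
}
```

A downstream Codelet therefore never has to handle batching itself; it sees one frame per received message.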

The bridge component is zero-copy for frame data: the frame data itself is not copied, only the data structures that represent it are translated.

The contents of the message entity pushed on this transmitter are:

| Component Name | Component Type / Details |
| --- | --- |
| Frame | nvidia::gxf::VideoBuffer or nvidia::gxf::AudioBuffer or nvidia::gxf::Tensor, depending on type of buffer (video/audio/raw) being translated |
| source-id | uint64_t. Unique identifier for the source of the frame in case of a multi-source graph |
| frame-num | uint64_t. A sequential number for frame originating from the same source (Video only) |
| pts | uint64_t. A 0-offset based timestamp for the frame assigned by the GStreamer pipeline |
| ntp | uint64_t. An NTP based timestamp for when the frame was created at the source. See https://docs.nvidia.com/metropolis/deepstream/dev-guide/text/DS_NTP_Timestamp.html for more details. |
| source-frame-width | uint64_t. Width of the frame at the source, useful in case the frame has undergone any transformations in the pipeline (Video only) |
| source-frame-height | uint64_t. Height of the frame at the source, useful in case the frame has undergone any transformations in the pipeline (Video only) |
| surface-type | uint64_t. Surface type of the subframe in a dewarped frame. (Applicable only to dewarped images) |
| surface-index | uint64_t. Surface index of the subframe in a dewarped frame. (Applicable only to dewarped images) |
| timestamp | nvidia::gxf::Timestamp. Timestamp component. |

The component informs downstream components about end-of-stream (EoS) by pushing an nvidia::gxf::EndOfStream data component on the frame-tx transmitter. When one of multiple sources reaches EoS, this component contains the source-id of that source. A source-id of -1 indicates EoS for the complete pipeline.

data-tx

Transmitter for other data. Same as frame data, the message entities pushed on this transmitter will contain data for a single frame.

As mentioned earlier, the bridge component uses translator components to translate this data. It calls the translate_ds_to_gxf method of all INvDsGxfDataTranslator based components added to the same entity. It passes the DeepStream buffer and metadata structure along with the output message entity to which all translators must add data components.

Based on the use case, any combination of translators can be added to the bridge entity.
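The dispatch to translators described above can be sketched as follows. This is a simplified model under stated assumptions: `Message`, `FrameMeta`, `Translator`, the two example translators and the component names `object-count` and `class-count` are all hypothetical. The real interface is INvDsGxfDataTranslator and operates on GstBuffer, NvDsBatchMeta and nvidia::gxf::Entity. Stopping at the first failing translator is also an assumption.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins: a message is modeled as a name -> value map and the
// per-frame metadata as a small struct.
using Message = std::map<std::string, int>;

struct FrameMeta {
  int num_objects;
  int num_classes;
};

// Reduced, hypothetical version of a translator interface.
struct Translator {
  virtual ~Translator() = default;
  // Adds this translator's components to `message`; returns true on success.
  virtual bool translate_ds_to_gxf(const FrameMeta& meta, Message& message) = 0;
};

struct ObjectCountTranslator : Translator {
  bool translate_ds_to_gxf(const FrameMeta& meta, Message& message) override {
    message["object-count"] = meta.num_objects;
    return true;
  }
};

struct ClassCountTranslator : Translator {
  bool translate_ds_to_gxf(const FrameMeta& meta, Message& message) override {
    message["class-count"] = meta.num_classes;
    return true;
  }
};

// Calls every translator found in the entity on the same output message.
// Assumption: translation stops at the first translator that fails.
bool run_translators(const std::vector<std::unique_ptr<Translator>>& translators,
                     const FrameMeta& meta, Message& message) {
  for (const auto& translator : translators) {
    assert(translator != nullptr);
    if (!translator->translate_ds_to_gxf(meta, message)) {
      return false;
    }
  }
  return true;
}
```

Because every translator writes into the same message, each one should use component names that do not collide with those of other translators.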

The contents of the message entity pushed on this transmitter are:

| Component Name | Component Type / Details |
| --- | --- |
| Data components added by translators | Names and types are defined by each translator |
| source-id | uint64_t. Same as in the frame message entity, useful to correlate frame and data message entities |
| frame-num | uint64_t. Same as in the frame message entity, useful to correlate frame and data message entities |
| timestamp | nvidia::gxf::Timestamp. Timestamp component. |

It is optional to link Transmitter components to either of the parameters.

NvDsToGxfBridge uses the AsynchronousSchedulingTerm for executing the bridge entity as soon as data is available from the DeepStream pipeline.

Codelet to DeepStream Bridge - NvGxfToDsBridge

The NvGxfToDsBridge component is responsible for receiving data from upstream Codelet components, translating it to DeepStream data structures and pushing it to the underlying DeepStream pipeline.

It acts as a source component in the DeepStream portion of the graph. Another DeepStream INvDsElement based component must be linked via its out I/O.

NvGxfToDsBridge has two receiver handle parameters to receive the data:

frame-rx

Receiver for frame data and related metadata.

The bridge component is zero-copy for frame data: the frame data itself is not copied, only the data structures that represent it are translated.

See the table of frame-tx message entity contents in the previous section for details of the message entity contents that the bridge can consume.

The component can receive nvidia::gxf::EndOfStream messages from upstream Codelet components and push the corresponding events into the DeepStream pipeline.

It is mandatory to link a Receiver component to frame-rx.

data-rx

Receiver for other data. As with frame data, the message entities received on this receiver contain data for a single frame.

The bridge component uses translator components to translate this data. It calls the translate_gxf_to_ds method of all INvDsGxfDataTranslator based components added to the same entity. It passes the incoming data message entity along with the output DeepStream buffer and metadata structure. The individual translator components translate only those data components in the incoming data message which they understand and update the output DeepStream data structures.

Based on the use case, any combination of translators can be added to the bridge entity.

It is optional to link a Receiver component to data-rx.

NvGxfToDsBridge pushes data to the DeepStream pipeline as soon as it is received.

Correlating message entities received on frame-rx and data-rx

As message entities can be received asynchronously on the two receiver components, the NvGxfToDsBridge component uses acqtime in the native Timestamp component for correlating the two messages. Thus it is mandatory that these messages contain the Timestamp component.
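One way to picture acqtime-based correlation is the sketch below. It is not the bridge's actual algorithm: `Msg` and `Correlator` are hypothetical, and the policy of holding an unmatched message until its counterpart arrives is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <utility>

// Hypothetical stand-in for a message entity that carries a Timestamp.
struct Msg {
  int64_t acqtime;      // acquisition time taken from the Timestamp component
  std::string payload;  // placeholder for the remaining components
};

// Pairs frame and data messages that share the same acqtime. A message that
// arrives first waits in a pending map until its counterpart shows up.
class Correlator {
 public:
  // Returns (frame, data) if the matching data message has already arrived.
  std::optional<std::pair<Msg, Msg>> on_frame(const Msg& frame) {
    auto it = pending_data_.find(frame.acqtime);
    if (it == pending_data_.end()) {
      pending_frames_.emplace(frame.acqtime, frame);
      return std::nullopt;
    }
    std::pair<Msg, Msg> matched{frame, it->second};
    pending_data_.erase(it);
    return matched;
  }

  // Returns (frame, data) if the matching frame message has already arrived.
  std::optional<std::pair<Msg, Msg>> on_data(const Msg& data) {
    auto it = pending_frames_.find(data.acqtime);
    if (it == pending_frames_.end()) {
      pending_data_.emplace(data.acqtime, data);
      return std::nullopt;
    }
    std::pair<Msg, Msg> matched{it->second, data};
    pending_frames_.erase(it);
    return matched;
  }

 private:
  std::map<int64_t, Msg> pending_frames_;
  std::map<int64_t, Msg> pending_data_;
};
```

The practical consequence for upstream Codelets is that the frame message and its data message should carry an identical acqtime, otherwise they cannot be matched.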

PTS handling

It is necessary to assign a proper PTS (Presentation timestamp) to the GStreamer buffers being pushed to the DeepStream pipeline. The NvGxfToDsBridge component uses the following order of preference for assigning the PTS:

  • pts data component if it is part of the message entity received on frame-rx

  • converting native Timestamp component to the GStreamer PTS if it is part of the message entity received on frame-rx

  • Using the current running system time when the message was received to calculate the PTS.
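The order of preference above amounts to a simple fallback chain, sketched here. `choose_pts` is a hypothetical helper, not part of the bridge API. The conversion from the Timestamp component to a PTS is not shown and is assumed to have happened before the call, and all values are assumed to share one time unit.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical helper mirroring the documented order of preference.
//   pts_component:      value of the "pts" component, if present
//   pts_from_timestamp: PTS already derived from the Timestamp component, if
//                       present (the conversion itself is not modeled here)
//   fallback_pts:       PTS computed from the running system time on receipt
uint64_t choose_pts(std::optional<uint64_t> pts_component,
                    std::optional<uint64_t> pts_from_timestamp,
                    uint64_t fallback_pts) {
  if (pts_component) {
    return *pts_component;  // first preference: explicit pts component
  }
  if (pts_from_timestamp) {
    return *pts_from_timestamp;  // second preference: Timestamp component
  }
  return fallback_pts;  // last resort: time at which the message was received
}
```

Supplying an explicit pts component upstream is the only way to fully control the timestamps seen by the DeepStream pipeline; the other two options are fallbacks.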

Translators - The INvDsGxfDataTranslator interface

The INvDsGxfDataTranslator interface is used as a base class for components that can translate between DeepStream data structures for metadata representation (NvDsBatchMeta) and corresponding data components (nvidia::gxf::Tensor and primitive data types like uint64_t and int64_t).

The interface provides the following virtual methods. Concrete implementations must implement these methods.

| Methods | Details |
| --- | --- |
| gxf_result_t translate_ds_to_gxf(GstBuffer *buffer, NvDsBatchMeta *batch_meta, int frame_idx, nvidia::gxf::Entity message, nvidia::gxf::Handle<nvidia::gxf::Allocator> allocator, nvidia::gxf::MemoryStorageType storage_type) | Implementations must translate the input data represented by GstBuffer and NvDsBatchMeta for a single frame having index frame_idx. The translated output data components must be added to the message entity. The supplied Allocator and MemoryStorageType can be used to allocate memory required by the output data components. |
| gxf_result_t translate_gxf_to_ds(nvidia::gxf::Entity message, GstBuffer *buffer, NvDsBatchMeta *batch_meta, int frame_idx) | Implementations must translate the input data represented by data components part of the message entity. The translated DeepStream data must be added to the supplied GstBuffer and/or NvDsBatchMeta for the frame having index frame_idx. |

Implementations must define a clear specification of the data components they handle, so that other components can code against this specification and consume or produce data based on it. The specification must include the name of the component in the message entity and the component type. In addition, if the component type is nvidia::gxf::Tensor, the shape of the tensor, the data type of the tensor contents and the storage type of the backing memory must be specified. For example, when translating from DeepStream to Codelet, the NvDsObjectDataTranslator component adds an nvidia::gxf::Tensor component to the output message entity with name bbox, shape (N, 4) where N is the number of objects in the frame, data type float and storage type gxf::MemoryStorageType::kSystem. Similarly, when translating from Codelet to DeepStream, the NvDsObjectDataTranslator component looks for a component named bbox in the input message entity with data type float and shape (N, 4).
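To make the (N, 4) float layout of the bbox example concrete, the sketch below packs per-object boxes into a flat buffer and reads them back. It uses plain C++ containers instead of nvidia::gxf::Tensor. `Box`, `pack_bboxes` and `unpack_bboxes` are hypothetical, and both the row-major layout and the column order (left, top, width, height) are assumptions that the text above does not specify.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical plain struct; the real per-object data lives in NvDsObjectMeta.
struct Box {
  float left;
  float top;
  float width;
  float height;
};

// Flattens N boxes into a buffer laid out as a row-major (N, 4) float tensor.
// Assumption: the column order is left, top, width, height.
std::vector<float> pack_bboxes(const std::vector<Box>& boxes) {
  std::vector<float> data;
  data.reserve(boxes.size() * 4);
  for (const Box& box : boxes) {
    data.push_back(box.left);
    data.push_back(box.top);
    data.push_back(box.width);
    data.push_back(box.height);
  }
  return data;
}

// Inverse of pack_bboxes: reads a row-major (N, 4) buffer back into boxes.
std::vector<Box> unpack_bboxes(const std::vector<float>& data) {
  assert(data.size() % 4 == 0);
  std::vector<Box> boxes;
  boxes.reserve(data.size() / 4);
  for (std::size_t i = 0; i < data.size(); i += 4) {
    boxes.push_back({data[i], data[i + 1], data[i + 2], data[i + 3]});
  }
  return boxes;
}
```

Writing the specification down at this level of detail (name, shape, element type, element order) is what lets a producer and a consumer agree without sharing code.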

Standard Translators

DeepStream extensions provide a set of translators for commonly used data types:

NvDsGxfObjectDataTranslator

Translates DS object data structure (NvDsObjectMeta) to Codelet data structure and vice versa. The data components produced or consumed by this translator are:

| Component Name | Component Type / Details |
| --- | --- |
| bbox | nvidia::gxf::Tensor, shape - (N, 4), data type - float. Bounding box co-ordinates for all objects in the frame |
| class-id | nvidia::gxf::Tensor, shape - (N, 1), data type - uint64_t. Class ids for all objects in the frame |
| confidence | nvidia::gxf::Tensor, shape - (N, 1), data type - float. Detection confidence for all objects in the frame. |
| object-id | nvidia::gxf::Tensor, shape - (N, 1), data type - int64_t. Object tracking ids for all objects in the frame. -1 for untracked objects |
| tracker-confidence | nvidia::gxf::Tensor, shape - (N, 1), data type - float. Tracking confidence for all objects in the frame. |
| object-label | nvidia::gxf::Tensor, shape - (N, L), data type - uint8_t. String labels for all objects in the frame. |
| detector-bbox | nvidia::gxf::Tensor, shape - (N, 4), data type - float. Original bounding box co-ordinates generated by detector |
| tracker-bbox | nvidia::gxf::Tensor, shape - (N, 4), data type - float. Original bounding box co-ordinates generated by tracker |
| classification-<cid> | nvidia::gxf::Tensor, shape - (N, M, 1), data type - uint64_t. Indexes for the output classes for all objects in the frame |
| classification-confidence-<cid> | nvidia::gxf::Tensor, shape - (N, M, 1), data type - float. Classification probability for the output classes for all objects in the frame |
| classification-label-<cid> | nvidia::gxf::Tensor, shape - (N, M, L), data type - uint8_t. String labels for the output classes for all objects in the frame |
| instance-segmentation | nvidia::gxf::Tensor, shape - (N, rows, cols), data type - float. Raw instance segmentation outputs for all objects in the frame |
| instance-segmentation-valid | nvidia::gxf::Tensor, shape - (N, 1), data type - uint8_t. Boolean indicating if the instance segmentation mask is valid for the object |

N - Number of objects in the frame

cid - Component ID that generated the meta data

M - Number of types of classification supported by the component. > 1 for multi-label classification models

L - Maximum length of string label
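String labels stored as an (N, L) uint8_t tensor can be pictured as N fixed-width rows of bytes. The sketch below shows one plausible encoding using plain containers. `pack_labels` and `label_at` are hypothetical, and both the zero padding and the truncation to L - 1 characters (to keep a terminating zero) are assumptions, since the exact padding convention is not stated here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Packs N labels into a row-major (N, max_len) byte buffer.
// Assumptions: rows are zero padded, and labels longer than max_len - 1 are
// truncated so that every row ends with at least one zero byte.
std::vector<uint8_t> pack_labels(const std::vector<std::string>& labels,
                                 std::size_t max_len) {
  assert(max_len > 0);
  std::vector<uint8_t> out(labels.size() * max_len, 0);
  for (std::size_t i = 0; i < labels.size(); ++i) {
    const std::size_t n = std::min(labels[i].size(), max_len - 1);
    for (std::size_t j = 0; j < n; ++j) {
      out[i * max_len + j] = static_cast<uint8_t>(labels[i][j]);
    }
  }
  return out;
}

// Reads the label stored in the given row, stopping at the first zero byte.
std::string label_at(const std::vector<uint8_t>& data, std::size_t max_len,
                     std::size_t row) {
  assert((row + 1) * max_len <= data.size());
  std::string label;
  for (std::size_t j = 0; j < max_len; ++j) {
    const uint8_t ch = data[row * max_len + j];
    if (ch == 0) {
      break;
    }
    label.push_back(static_cast<char>(ch));
  }
  return label;
}
```

The same row-per-string idea extends to the (N, M, L) classification label tensors, with one additional index for the classification type.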

NvDsGxfAudioClassificationDataTranslator

Translates DS audio classification data structure (NvDsAudioFrameMeta/NvDsClassifierMeta) to Codelet data structure and vice versa. The data components produced or consumed by this translator are:

| Component Name | Component Type / Details |
| --- | --- |
| audio-classification-<cid> | nvidia::gxf::Tensor, shape - (M, 1), data type - uint64_t. Indexes for the output classes for the current audio frame |
| audio-classification-confidence-<cid> | nvidia::gxf::Tensor, shape - (M, 1), data type - float. Classification probability for the output classes for the current audio frame |
| audio-classification-label-<cid> | nvidia::gxf::Tensor, shape - (M, L), data type - uint8_t. String labels for the output classes for the current audio frame |

cid - Component ID that generated the meta data

M - Number of types of classification supported by the component. > 1 for multi-label classification models

L - Maximum length of string label

NvDsGxfOpticalFlowDataTranslator

Translates DS optical flow data structure (NvDsOpticalFlowMeta) to Codelet data structure and vice versa. The data components produced or consumed by this translator are:

| Component Name | Component Type / Details |
| --- | --- |
| motion-vectors | nvidia::gxf::Tensor, shape - (rows, columns, 2), data type - int16_t. Contains motion vectors for x and y directions. |
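Reading one motion vector out of a (rows, columns, 2) int16_t buffer could look like the sketch below. `MotionVector` and `motion_vector_at` are hypothetical names, and both the row-major layout and the ordering of the last dimension (x first, then y) are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical pair of flow components for one grid cell.
struct MotionVector {
  int16_t x;
  int16_t y;
};

// Returns the motion vector stored at grid position (row, col).
// Assumptions: the buffer is row-major with shape (rows, cols, 2) and the
// last dimension holds x followed by y.
MotionVector motion_vector_at(const std::vector<int16_t>& data,
                              std::size_t rows, std::size_t cols,
                              std::size_t row, std::size_t col) {
  assert(data.size() == rows * cols * 2);
  assert(row < rows && col < cols);
  const std::size_t base = (row * cols + col) * 2;
  return {data[base], data[base + 1]};
}
```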

NvDsGxfSegmentationDataTranslator

Translates DS segmentation data structure (NvDsInferSegmentationMeta) to Codelet data structure and vice versa. The data components produced or consumed by this translator are:

| Component Name | Component Type / Details |
| --- | --- |
| seg-class-map | nvidia::gxf::Tensor, shape - (rows, columns), data type - int32. Per-pixel map of class-ids with highest classification probability for that pixel |
| seg-class-probability-matrix | nvidia::gxf::Tensor, shape - (rows, columns, classes), data type - float. Raw segmentation output containing classification probabilities for all pixels for all classes |
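The relationship between the two components, a per-pixel class map holding the class with the highest probability, can be illustrated with a per-pixel argmax. This sketch uses plain containers. `class_map_from_probabilities` is a hypothetical helper, the row-major (rows, columns, classes) layout is an assumption, and any thresholding a real model or plugin may apply is not modeled.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Computes a (rows, cols) class map from a (rows, cols, classes) probability
// matrix by taking, for each pixel, the class with the highest probability.
// Assumption: the buffer is row-major, with the class index varying fastest.
// Ties resolve to the lowest class index.
std::vector<int32_t> class_map_from_probabilities(const std::vector<float>& probs,
                                                  std::size_t rows,
                                                  std::size_t cols,
                                                  std::size_t classes) {
  assert(classes > 0);
  assert(probs.size() == rows * cols * classes);
  std::vector<int32_t> class_map(rows * cols);
  for (std::size_t pixel = 0; pixel < rows * cols; ++pixel) {
    const float* first = probs.data() + pixel * classes;
    const float* best = std::max_element(first, first + classes);
    class_map[pixel] = static_cast<int32_t>(best - first);
  }
  return class_map;
}
```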

NvDsGxfInferTensorDataTranslator

Translates DS infer tensor data structure (NvDsInferTensorMeta) to Codelet data structure and vice versa. The data components produced or consumed by this translator are:

| Component Name | Component Type / Details |
| --- | --- |
| infer-tensor-frame-<layer-name> | nvidia::gxf::Tensor. Contains frame-level raw inference output for the layer named <layer-name>. Shape and data type depend on the model layer shape and data type |
| infer-tensor-object-<layer-name> | nvidia::gxf::Tensor. Contains object-level raw inference output for the layer named <layer-name>. Shape and data type depend on the model layer shape and data type; the shape has an additional highest-order dimension of N (number of objects in the frame). |