DeepStream to Codelet Bridge
================================

DeepStream extensions provide a few components for interoperability between GStreamer plugin extensions and Codelets. These
"bridge" components pull data out of the underlying DeepStream GStreamer pipeline and translate the DeepStream
data structures representing buffers and metadata into Codelet data components, and vice versa.

The bridge components themselves translate buffer data between DeepStream and Codelets. For other data, the bridge components
use helper components called "translators", which derive from the ``INvDsGxfDataTranslator`` interface. DeepStream extensions provide
translators, listed later, for some commonly used data structures. You can implement custom data translators from the interface for custom
data types without changing any core DeepStream components.

You may choose the types of data that must be translated and add the corresponding translator components to the same entity as the bridge component. No explicit
links to the bridge component are required.

The bridge components are Codelet based components. They require a standard scheduler like ``GreedyScheduler`` or ``MultithreadedScheduler`` in addition to
the ``NvDsScheduler`` to be part of the graph. The bridge component execution can be controlled by adding standard scheduling terms to the same entity as the
bridge.


DeepStream to Codelet Bridge - NvDsToGxfBridge
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
The ``NvDsToGxfBridge`` component is responsible for pulling data out of the underlying pipeline, translating it to native data components used by Codelets and pushing it to
the downstream Codelet components.

It acts as a sink component in the DeepStream portion of the graph. Another DeepStream ``INvDsElement`` based component must be linked via its ``in`` I/O.

``NvDsToGxfBridge`` has two transmitter handle parameters that are used to push data:

.. _frame_table:

**frame-tx**

Transmitter for translated frame data and related metadata. The bridge supports consuming DeepStream's batched buffer but only translates and pushes
a single frame from the batch at a time. The output message entities generated by the ``NvDsToGxfBridge`` component will contain data for a single frame,
but the bridge will generate ``N`` such messages for a single batched DeepStream buffer of size ``N`` before moving to the next buffer.
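
The per-frame fan-out described above can be pictured with a small, self-contained Python model. It is only an illustration under
stated assumptions: ``BatchedBuffer``, ``FrameMessage`` and ``unbatch`` are hypothetical names, and plain Python objects stand in for the
DeepStream buffer and the output message entities. None of them are part of the DeepStream or GXF APIs.

.. code-block:: python

    from dataclasses import dataclass
    from typing import Iterator, List


    @dataclass
    class BatchedBuffer:
        """Hypothetical stand-in for a batched DeepStream buffer."""
        frames: List[str]        # one payload per frame in the batch
        source_ids: List[int]    # source-id of each frame


    @dataclass
    class FrameMessage:
        """Hypothetical stand-in for a single-frame output message entity."""
        frame: str
        source_id: int


    def unbatch(buffers: List[BatchedBuffer]) -> Iterator[FrameMessage]:
        """Yield one message per frame, finishing a buffer before the next."""
        for buf in buffers:
            for frame, source_id in zip(buf.frames, buf.source_ids):
                yield FrameMessage(frame=frame, source_id=source_id)


    messages = list(unbatch([
        BatchedBuffer(frames=["a0", "b0"], source_ids=[0, 1]),
        BatchedBuffer(frames=["a1", "b1"], source_ids=[0, 1]),
    ]))

In this model, two batched buffers of size 2 produce four single-frame messages, and both frames of the first buffer are emitted before any frame of the second.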

The bridge component handles frame data with zero copy: the frame data itself is not copied; only the data structures that represent it are translated.

The contents of the message entity pushed on this transmitter are:

======================     ===================================================================================================================================================================================================================
  Component Name            Component Type / Details
======================     ===================================================================================================================================================================================================================
  Frame                     ``nvidia::gxf::VideoBuffer`` or ``nvidia::gxf::AudioBuffer`` or ``nvidia::gxf::Tensor``, depending on type of buffer (video/audio/raw) being translated
  source-id                 uint64_t. Unique identifier for the source of the frame in case of a multi-source graph
  frame-num                 uint64_t. A sequential number for frame originating from the same source (Video only)
  pts                       uint64_t. A 0-offset based timestamp for the frame assigned by the GStreamer pipeline
  ntp                       uint64_t. An NTP based timestamp for when the frame was created at the source. See https://docs.nvidia.com/metropolis/deepstream/dev-guide/text/DS_NTP_Timestamp.html for more details.
  source-frame-width        uint64_t. Width of the frame at the source, useful in case the frame has undergone any transformations in the pipeline (Video only)
  source-frame-height       uint64_t. Height of the frame at the source, useful in case the frame has undergone any transformations in the pipeline (Video only)
  surface-type              uint64_t. Surface type of the subframe in a dewarped frame. (Applicable only to dewarped images)
  surface-index             uint64_t. Surface index of the subframe in a dewarped frame. (Applicable only to dewarped images)
  timestamp                 ``nvidia::gxf::Timestamp``. Timestamp component.
======================     ===================================================================================================================================================================================================================

The component informs downstream components about end-of-stream (EoS) by pushing an ``nvidia::gxf::EndOfStream`` data component on the ``frame-tx`` transmitter. When one of multiple
sources reaches EoS, this component contains the source-id of that source. A source-id of ``-1`` indicates EoS for the complete pipeline.
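
A downstream consumer might track the sources that are still active along the following lines. This is a minimal sketch, not DeepStream code:
``remaining_sources`` and ``ALL_SOURCES`` are hypothetical names, and the EoS message is reduced to its source-id.

.. code-block:: python

    from typing import Set

    ALL_SOURCES = -1  # source-id value documented for complete pipeline EoS


    def remaining_sources(active: Set[int], eos_source_id: int) -> Set[int]:
        """Return the sources still streaming after an EndOfStream message."""
        if eos_source_id == ALL_SOURCES:
            return set()
        return active - {eos_source_id}


    active = {0, 1, 2}
    active = remaining_sources(active, 1)               # source 1 finished
    done = not remaining_sources(active, ALL_SOURCES)   # whole pipeline finished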

**data-tx**

Transmitter for other data. As with frame data, the message entities pushed on this transmitter contain data for a single frame.

As mentioned earlier, the bridge component uses translator components to translate this data. It calls the ``translate_ds_to_gxf`` method of all ``INvDsGxfDataTranslator``
based components added to the same entity. It passes the DeepStream buffer and metadata structure along with the output message entity to which all translators must add
data components.

Based on the use case, any combination of translators can be added to the bridge entity.
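
The dispatch described above, in which every translator on the entity adds its components to one shared output message, can be sketched as
follows. This is a simplified Python model under stated assumptions: ``DataTranslator``, ``BBoxSketch``, ``ObjectCountSketch`` and
``build_data_message`` are hypothetical, dictionaries stand in for the DeepStream metadata and the message entity, and the real
``translate_ds_to_gxf`` signature (shown later in this section) takes more arguments.

.. code-block:: python

    from abc import ABC, abstractmethod
    from typing import Dict, List


    class DataTranslator(ABC):
        """Hypothetical Python analogue of an INvDsGxfDataTranslator component."""

        @abstractmethod
        def translate_ds_to_gxf(self, frame_meta: dict, message: Dict[str, object]) -> None:
            """Add translated data components to ``message``."""


    class BBoxSketch(DataTranslator):
        def translate_ds_to_gxf(self, frame_meta, message):
            message["bbox"] = [obj["rect"] for obj in frame_meta["objects"]]


    class ObjectCountSketch(DataTranslator):
        """Stands in for a custom translator handling a custom data type."""

        def translate_ds_to_gxf(self, frame_meta, message):
            message["object-count"] = len(frame_meta["objects"])


    def build_data_message(frame_meta: dict, translators: List[DataTranslator]) -> Dict[str, object]:
        """Call every translator on the same output message."""
        message: Dict[str, object] = {
            "source-id": frame_meta["source_id"],
            "frame-num": frame_meta["frame_num"],
        }
        for translator in translators:
            translator.translate_ds_to_gxf(frame_meta, message)
        return message


    frame_meta = {
        "source_id": 0,
        "frame_num": 3,
        "objects": [{"rect": [1.0, 2.0, 30.0, 40.0]}],
    }
    message = build_data_message(frame_meta, [BBoxSketch(), ObjectCountSketch()])

Adding or removing a translator from the list changes which components appear in the message without any change to the dispatch loop.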

The contents of the message entity pushed on this transmitter are:

======================     ===============================================================================================================
  Component Name            Component Type / Details
======================     ===============================================================================================================
  ...                       Data components added by translators
  source-id                 uint64_t. Same as in frame message entity, useful to correlate frame and data message entities
  frame-num                 uint64_t. Same as in frame message entity, useful to correlate frame and data message entities
  timestamp                 ``nvidia::gxf::Timestamp``. Timestamp component.
======================     ===============================================================================================================
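
A downstream Codelet that receives both kinds of message could pair them on ``source-id`` and ``frame-num``, for example as sketched below.
The ``correlate`` helper and the dictionary representation of a message entity are assumptions made for illustration only.

.. code-block:: python

    from typing import Dict, List, Tuple

    Message = Dict[str, object]   # hypothetical: component name -> value


    def correlate(frames: List[Message], data: List[Message]) -> List[Tuple[Message, Message]]:
        """Pair frame and data messages that share (source-id, frame-num)."""
        by_key = {(d["source-id"], d["frame-num"]): d for d in data}
        pairs = []
        for f in frames:
            match = by_key.get((f["source-id"], f["frame-num"]))
            if match is not None:
                pairs.append((f, match))
        return pairs


    frames = [
        {"source-id": 0, "frame-num": 7, "Frame": "video-buffer-0"},
        {"source-id": 1, "frame-num": 7, "Frame": "video-buffer-1"},
    ]
    data = [
        {"source-id": 1, "frame-num": 7, "bbox": [[0.0, 0.0, 10.0, 10.0]]},
    ]
    pairs = correlate(frames, data)

Only the frame from source 1 has a matching data message here, so ``pairs`` holds a single entry.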

It is optional to link Transmitter components to either of the parameters.

``NvDsToGxfBridge`` uses the ``AsynchronousSchedulingTerm`` for executing the bridge entity as soon as data is available from the DeepStream pipeline.


Codelet to DeepStream Bridge - NvGxfToDsBridge
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The ``NvGxfToDsBridge`` component is responsible for receiving data from upstream Codelet components, translating it to DeepStream data
structures and pushing it to the underlying DeepStream pipeline.

It acts as a source component in the DeepStream portion of the graph. Another DeepStream ``INvDsElement`` based component must be linked via its ``out`` I/O.

``NvGxfToDsBridge`` has two receiver handle parameters to receive the data:

**frame-rx**

Receiver for frame data and related metadata.

The bridge component handles frame data with zero copy: the frame data itself is not copied; only the data structures that represent it are translated.

See the :ref:`frame-tx table <frame_table>` in the previous section for details of the message entity contents that the bridge can consume.

The component can receive ``nvidia::gxf::EndOfStream`` messages from upstream Codelet components and push the corresponding events into the DeepStream pipeline.

It is mandatory to link a Receiver component to ``frame-rx``.

**data-rx**

Receiver for other data. As with frame data, the message entities received on this receiver contain data for a single frame.

The bridge component uses translator components to translate this data. It calls the ``translate_gxf_to_ds`` method of all ``INvDsGxfDataTranslator``
based components added to the same entity. It passes the incoming data message entity along with the output DeepStream buffer and metadata structure. The individual translator components
translate only those data components in the incoming data message which they understand and update the output DeepStream data structures.
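
The "translate only what you understand" behaviour can be modelled as below. This is a hedged sketch rather than the real interface:
``BBoxToDsSketch`` is a hypothetical translator, dictionaries stand in for the message entity and for the DeepStream metadata, and the
``custom-blob`` component is invented to show an entry that the translator skips.

.. code-block:: python

    from typing import Dict


    class BBoxToDsSketch:
        """Hypothetical translator that only understands the ``bbox`` component."""

        def translate_gxf_to_ds(self, message: Dict[str, object], frame_meta: dict) -> None:
            bbox = message.get("bbox")
            if bbox is None:
                return  # nothing this translator understands; leave metadata untouched
            frame_meta.setdefault("objects", []).extend({"rect": list(row)} for row in bbox)


    message = {"bbox": [[1.0, 2.0, 30.0, 40.0]], "custom-blob": b"\x00\x01"}
    frame_meta: dict = {}
    for translator in [BBoxToDsSketch()]:
        translator.translate_gxf_to_ds(message, frame_meta)

    # A message without ``bbox`` leaves the metadata unchanged.
    untouched: dict = {}
    BBoxToDsSketch().translate_gxf_to_ds({"custom-blob": b""}, untouched)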

Based on the use case, any combination of translators can be added to the bridge entity.

It is optional to link a Receiver component to ``data-rx``.

``NvGxfToDsBridge`` pushes data to the DeepStream pipeline as soon as it is received.

**Correlating message entities received on frame-rx and data-rx**

As message entities can be received asynchronously on the two receiver components, the ``NvGxfToDsBridge`` component uses ``acqtime`` in the native Timestamp component for correlating the two messages.
Thus it is mandatory that these messages contain the Timestamp component.
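
One way to picture correlation on ``acqtime`` is a pair of holding tables, one per receiver, as in the sketch below. ``AcqtimeMatcher`` is a
hypothetical helper and the nested dictionaries stand in for message entities and their Timestamp component. How the actual bridge buffers
or expires unmatched messages is not described here, so the sketch makes no attempt to model that.

.. code-block:: python

    from typing import Dict, Optional, Tuple


    class AcqtimeMatcher:
        """Hypothetical helper pairing frame and data messages by ``acqtime``."""

        def __init__(self) -> None:
            self._frames: Dict[int, dict] = {}
            self._data: Dict[int, dict] = {}

        def on_frame(self, msg: dict) -> Optional[Tuple[dict, dict]]:
            return self._match(msg, self._frames, self._data, frame_first=True)

        def on_data(self, msg: dict) -> Optional[Tuple[dict, dict]]:
            return self._match(msg, self._data, self._frames, frame_first=False)

        def _match(self, msg, own, other, frame_first):
            acqtime = msg["timestamp"]["acqtime"]   # mandatory Timestamp component
            partner = other.pop(acqtime, None)
            if partner is None:
                own[acqtime] = msg                  # wait for the other receiver
                return None
            return (msg, partner) if frame_first else (partner, msg)


    matcher = AcqtimeMatcher()
    first = matcher.on_data({"timestamp": {"acqtime": 100}, "bbox": []})
    second = matcher.on_frame({"timestamp": {"acqtime": 100}, "Frame": "f"})

The data message arrives first and is held, so ``first`` is ``None``; the frame message with the same ``acqtime`` then completes the pair, returned as ``(frame, data)``.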

**PTS handling**

A proper PTS (presentation timestamp) must be assigned to the GStreamer buffers pushed to the DeepStream pipeline. The ``NvGxfToDsBridge`` component uses the following
order of preference for assigning the PTS:

- The ``pts`` data component, if it is part of the message entity received on ``frame-rx``.
- The native Timestamp component converted to a GStreamer PTS, if it is part of the message entity received on ``frame-rx``.
- The current running system time when the message was received, used to calculate the PTS.
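
The order of preference can be expressed as a simple fallback chain. The sketch below is an assumption-laden model: ``choose_pts`` is a
hypothetical function, the Timestamp conversion is reduced to using ``acqtime`` unchanged, and the running time is approximated as
``now_ns - start_ns``. The actual conversions performed by the bridge may differ.

.. code-block:: python

    def choose_pts(message: dict, now_ns: int, start_ns: int) -> int:
        """Pick a PTS in nanoseconds using the documented order of preference."""
        if "pts" in message:
            return message["pts"]
        timestamp = message.get("timestamp")
        if timestamp is not None:
            # Assumption: acqtime is used as-is; the real conversion may apply an offset.
            return timestamp["acqtime"]
        # Assumption: running time is the receive time relative to pipeline start.
        return now_ns - start_ns


    explicit = choose_pts({"pts": 5, "timestamp": {"acqtime": 9}}, now_ns=100, start_ns=40)
    from_timestamp = choose_pts({"timestamp": {"acqtime": 9}}, now_ns=100, start_ns=40)
    from_clock = choose_pts({}, now_ns=100, start_ns=40)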

Translators - The INvDsGxfDataTranslator interface
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The ``INvDsGxfDataTranslator`` interface is used as a base class for components that can translate between DeepStream data structures for metadata representation
(``NvDsBatchMeta``) and corresponding data components (``nvidia::gxf::Tensor`` and primitive data types like ``uint64_t`` and ``int64_t``).

The interface provides the following virtual methods. Concrete implementations must implement these methods.

+----------------------------------------------------------------------+---------------------------------------------------------------------------------+
| Methods                                                              |       Details                                                                   |
+======================================================================+=================================================================================+
|   gxf_result_t translate_ds_to_gxf(                                  | Implementations must translate the input data represented by GstBuffer and      |
|       GstBuffer \*buffer, NvDsBatchMeta \*batch_meta, int frame_idx, | ``NvDsBatchMeta`` for a single frame having index ``frame_idx``. The translated |
|       nvidia::gxf::Entity message,                                   | output data components must be added to the ``message`` entity. The             |
|       nvidia::gxf::Handle<nvidia::gxf::Allocator> allocator,         | supplied Allocator and MemoryStorageType can be used to allocate memory         |
|       nvidia::gxf::MemoryStorageType storage_type)                   | required by the output data components.                                         |   
+----------------------------------------------------------------------+---------------------------------------------------------------------------------+
|   gxf_result_t translate_gxf_to_ds(nvidia::gxf::Entity message,      | Implementations must translate the input data represented by data               |
|       GstBuffer \*buffer,                                            | components part of the ``message`` entity. The translated DeepStream data must  |
|       NvDsBatchMeta \*batch_meta,                                    | be added to the supplied ``GstBuffer`` and/or ``NvDsBatchMeta`` for the frame   |
|       int frame_idx)                                                 | having index ``frame_idx``.                                                     |
+----------------------------------------------------------------------+---------------------------------------------------------------------------------+

Implementations must define a clear specification of the data components they handle, so that other components can code against the specification and consume or produce data accordingly.
The specification must include the name of the component in the message entity and the component type. In addition, if the component type is ``nvidia::gxf::Tensor``, the
shape of the tensor, the data type of the tensor contents and the storage type of the backing memory must be specified. For example, when translating from DeepStream to Codelet, the
``NvDsGxfObjectDataTranslator`` component adds an ``nvidia::gxf::Tensor`` component named ``bbox`` to the output message entity, with shape ``(N, 4)`` where ``N`` is the number of objects in the frame, data type
``float`` and storage type ``gxf::MemoryStorageType::kSystem``. Similarly, when translating from Codelet to DeepStream, the ``NvDsGxfObjectDataTranslator`` component looks for a component named ``bbox`` in
the input message entity with data type ``float`` and shape ``(N, 4)``.
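
Such a specification can be written down in a machine-checkable form. The following sketch is one possible way to do that and is not part of
DeepStream: ``TensorSpec`` and ``BBOX_SPEC`` are hypothetical, and data and storage types are represented as plain strings.

.. code-block:: python

    from dataclasses import dataclass
    from typing import Optional, Tuple


    @dataclass(frozen=True)
    class TensorSpec:
        """Hypothetical description of a tensor component in a message entity."""
        name: str
        shape: Tuple[Optional[int], ...]   # None marks a variable dimension such as N
        dtype: str
        storage: str

        def matches(self, shape: Tuple[int, ...], dtype: str) -> bool:
            """Check a concrete shape and data type against this specification."""
            if dtype != self.dtype or len(shape) != len(self.shape):
                return False
            return all(want is None or want == got for want, got in zip(self.shape, shape))


    BBOX_SPEC = TensorSpec(name="bbox", shape=(None, 4), dtype="float", storage="kSystem")

    ok = BBOX_SPEC.matches((3, 4), "float")          # three objects, four coordinates
    wrong_shape = BBOX_SPEC.matches((3, 5), "float")
    wrong_dtype = BBOX_SPEC.matches((3, 4), "int64_t")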


Standard Translators
^^^^^^^^^^^^^^^^^^^^^^^^

DeepStream extensions provide a set of translators for commonly used data types:

NvDsGxfObjectDataTranslator
-----------------------------------

Translates DS object data structure (``NvDsObjectMeta``) to Codelet data structure and vice versa. The data components produced or consumed by this translator are:

================================     =================================================================================================================================================================
  Component Name                        Component Type / Details
================================     =================================================================================================================================================================
 bbox                                  ``nvidia::gxf::Tensor``, shape - ``(N, 4)``, data type - ``float``. Bounding box co-ordinates for all objects in the frame
 class-id                              ``nvidia::gxf::Tensor``, shape - ``(N, 1)``, data type - ``uint64_t``. Class ids for all objects in the frame
 confidence                            ``nvidia::gxf::Tensor``, shape - ``(N, 1)``, data type - ``float``. Detection confidence for all objects in the frame.
 object-id                             ``nvidia::gxf::Tensor``, shape - ``(N, 1)``, data type - ``int64_t``. Object tracking ids for all objects in the frame. -1 for untracked objects
 tracker-confidence                    ``nvidia::gxf::Tensor``, shape - ``(N, 1)``, data type - ``float``. Tracking confidence for all objects in the frame.
 object-label                          ``nvidia::gxf::Tensor``, shape - ``(N, L)``, data type - ``uint8_t``. String labels for all objects in the frame.
 detector-bbox                         ``nvidia::gxf::Tensor``, shape - ``(N, 4)``, data type - ``float``. Original bounding box co-ordinates generated by detector
 tracker-bbox                          ``nvidia::gxf::Tensor``, shape - ``(N, 4)``, data type - ``float``. Original bounding box co-ordinates generated by tracker
 classification-<cid>                  ``nvidia::gxf::Tensor``, shape - ``(N, M, 1)``, data type - ``uint64_t``. Indexes for the output classes for all objects in the frame
 classification-confidence-<cid>       ``nvidia::gxf::Tensor``, shape - ``(N, M, 1)``, data type - ``float``. Classification probability for the output classes for all objects in the frame
 classification-label-<cid>            ``nvidia::gxf::Tensor``, shape - ``(N, M, L)``, data type - ``uint8_t``. String labels for the output classes for all objects in the frame
 instance-segmentation                 ``nvidia::gxf::Tensor``, shape - ``(N, rows, cols)``, data type - ``float``. Raw instance segmentation outputs for all objects in the frame
 instance-segmentation-valid           ``nvidia::gxf::Tensor``, shape - ``(N, 1)``, data type - ``uint8_t``. Boolean indicating if the instance segmentation mask is valid for the object
================================     =================================================================================================================================================================

``N`` - Number of objects in the frame

``cid`` - Component ID that generated the metadata

``M`` - Number of types of classification supported by the component. > 1 for multi-label classification models

``L`` - Maximum length of string label
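
String labels stored as fixed-width ``uint8_t`` rows need to be decoded before use. The sketch below assumes, without confirmation from this
document, that each row is NUL-padded to length ``L`` and holds UTF-8 text; ``decode_labels`` is a hypothetical helper and ``bytes`` objects
stand in for tensor rows.

.. code-block:: python

    from typing import List


    def decode_labels(rows: List[bytes]) -> List[str]:
        """Decode fixed-width label rows, assuming NUL padding and UTF-8 text."""
        return [row.split(b"\x00", 1)[0].decode("utf-8") for row in rows]


    L = 8  # hypothetical maximum label length
    object_label = [b"car".ljust(L, b"\x00"), b"person".ljust(L, b"\x00")]
    labels = decode_labels(object_label)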

NvDsGxfAudioClassificationDataTranslator
--------------------------------------------


Translates DS audio classification data structure (``NvDsAudioFrameMeta``/``NvDsClassifierMeta``) to Codelet data structure and vice versa. The data components produced or consumed by this translator are:

========================================     =================================================================================================================================================================
  Component Name                               Component Type / Details
========================================     =================================================================================================================================================================
 audio-classification-<cid>                   ``nvidia::gxf::Tensor``, shape - ``(M, 1)``, data type - ``uint64_t``. Indexes for the output classes for the current audio frame
 audio-classification-confidence-<cid>        ``nvidia::gxf::Tensor``, shape - ``(M, 1)``, data type - ``float``. Classification probability for the output classes for the current audio frame
 audio-classification-label-<cid>             ``nvidia::gxf::Tensor``, shape - ``(M, L)``, data type - ``uint8_t``. String labels for the output classes for the current audio frame
========================================     =================================================================================================================================================================

``cid`` - Component ID that generated the metadata

``M`` - Number of types of classification supported by the component. > 1 for multi-label classification models

``L`` - Maximum length of string label

NvDsGxfOpticalFlowDataTranslator
-----------------------------------

Translates DS optical flow data structure (``NvDsOpticalFlowMeta``) to Codelet data structure and vice versa. The data components produced or consumed by this translator are:

======================     =================================================================================================================================================================
  Component Name            Component Type / Details
======================     =================================================================================================================================================================
  motion-vectors            ``nvidia::gxf::Tensor``, shape - ``(rows, columns, 2)``, data type - ``int16_t``. Contains motion vectors for ``x`` and ``y`` directions.
======================     =================================================================================================================================================================

NvDsGxfSegmentationDataTranslator
-----------------------------------

Translates DS segmentation data structure (``NvDsInferSegmentationMeta``) to Codelet data structure and vice versa. The data components produced or consumed by this translator are:

==============================     ==========================================================================================================================================================================================
  Component Name                     Component Type / Details
==============================     ==========================================================================================================================================================================================
  seg-class-map                      ``nvidia::gxf::Tensor``, shape - ``(rows, columns)``, data type - ``int32_t``. Per-pixel map of class-ids with highest classification probability for that pixel
  seg-class-probability-matrix       ``nvidia::gxf::Tensor``, shape - ``(rows, columns, classes)``, data type - ``float``. Raw segmentation output containing classification probabilities for all pixels for all classes
==============================     ==========================================================================================================================================================================================
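
The relationship between the two components can be illustrated with a per-pixel argmax. This is a plain-Python sketch using nested lists in
place of tensors; ``class_map`` is a hypothetical helper, and any thresholding or special values that the real pipeline may apply are
ignored.

.. code-block:: python

    from typing import List


    def class_map(prob: List[List[List[float]]]) -> List[List[int]]:
        """Return, per pixel, the index of the class with the highest probability."""
        return [[max(range(len(px)), key=px.__getitem__) for px in row] for row in prob]


    # 1 row, 2 columns, 3 classes
    probabilities = [[[0.1, 0.7, 0.2], [0.6, 0.3, 0.1]]]
    seg_class_map = class_map(probabilities)

The input has shape ``(rows, columns, classes)`` and the result has shape ``(rows, columns)``, matching the shapes listed in the table.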

NvDsGxfInferTensorDataTranslator
-----------------------------------

Translates DS infer tensor data structure (``NvDsInferTensorMeta``) to Codelet data structure and vice versa. The data components produced or consumed by this translator are:

====================================     ==============================================================================================================================================================================================================================================================================================
  Component Name                           Component Type / Details
====================================     ==============================================================================================================================================================================================================================================================================================
  infer-tensor-frame-<layer-name>          ``nvidia::gxf::Tensor``. Contains frame-level raw inference output for the layer named ``<layer-name>``. Shape and data type depend on the model layer shape and data type
  infer-tensor-object-<layer-name>         ``nvidia::gxf::Tensor``. Contains object-level raw inference output for the layer named ``<layer-name>``. Shape and data type depend on the model layer shape and data type; the shape has an additional highest-order dimension of N (number of objects in the frame).
====================================     ==============================================================================================================================================================================================================================================================================================