Gst-nvreplay#
The Gst-nvreplay plugin loads pre-generated metadata from external files and injects it into DeepStream pipelines, effectively standing in for the corresponding inference plugins. This enables flexible proofs of concept, reproducible evaluations, and efficient parameter tuning by decoupling metadata generation from the pipeline.
The plugin currently supports loading detection metadata equivalent to that of a primary inference engine (PGIE). It reads bounding box annotations from text files in the MOT challenge format and attaches them to frames as NvDsObjectMeta, allowing downstream components (trackers, analytics modules, visualization) to be tested without running computationally expensive inference.
Typical Pipeline Structure: source -> decoder -> nvstreammux -> nvreplay -> nvtracker -> nvdsosd -> sink
Inputs and Outputs#
Inputs
Gst Buffer with NV12/RGBA/I420 format (NVMM memory)
NvDsBatchMeta (must be attached by upstream components)
Detection label files in MOT challenge format
Output
Gst Buffer
NvDsObjectMeta filled with bounding box coordinates, confidence score, and class information
The frame_meta->bInferDone flag is set to TRUE, indicating detection completion
Features#
The following table summarizes the features of the plugin.
| Feature | Description | Release |
|---|---|---|
| Supports MOT detection format | Loads pre-generated detection metadata in MOT challenge format (CSV) to replay inference functionality. | DS 9.0 |
| External metadata ingestion | Efficiently injects externally generated metadata into DeepStream pipelines, functioning as a drop-in replacement for the primary inference engine (PGIE). | DS 9.0 |
| Supports multiple streams | Can load detection labels from multiple files corresponding to different source streams in a batch. | DS 9.0 |
| Coordinate scaling | Automatically scales bounding box coordinates from label space to the actual frame resolution. | DS 9.0 |
| Flexible file organization | Supports both flat and nested directory structures for label files. | DS 9.0 |
| Frame interval control | Supports skipping batches when loading labels, similar to the inference interval in nvinfer. | DS 9.0 |
| Max frame number control | Supports limiting the maximum frame number per stream, with automatic wrapping. | DS 9.0 |
Gst Properties#
The following table describes the Gst properties of the Gst-nvreplay plugin.
| Property | Meaning | Type and Range | Example Notes |
|---|---|---|---|
| label-dir | Directory path containing ground truth label files. | String | label-dir=/path/to/labels/ |
| file-names | Semicolon-separated list of label file names, one per source stream. The order must match the source order (source0, source1, etc.). | String | file-names=src0_det.txt;src1_det.txt |
| max-frame-nums | Semicolon-separated list of maximum frame numbers, one per stream. If not specified, the maximum frame number from each label file is used. Frame numbers wrap around using modulo when exceeding this value. | String | max-frame-nums=100;100;100 |
| label-width | Frame width in the coordinate system used by the ground truth labels. Used to scale coordinates to the actual frame width. | Integer, 0 to 4,294,967,295. Default: 1920 | label-width=1920 |
| label-height | Frame height in the coordinate system used by the ground truth labels. Used to scale coordinates to the actual frame height. | Integer, 0 to 4,294,967,295. Default: 1080 | label-height=1080 |
| interval | Number of consecutive batches to skip before adding labels to metadata. 0 means labels are added to every batch. | Integer, 0 to 4,294,967,295. Default: 0 | interval=0 |
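The wrapping behavior of `max-frame-nums` and the batch-skipping behavior of `interval` can be sketched as follows. This is a minimal illustration with hypothetical helper names, not the plugin's actual code:

```python
def wrap_frame_num(frame_num: int, max_frame_num: int) -> int:
    """Wrap a 0-indexed frame number using modulo, as described for
    max-frame-nums: frames beyond the limit wrap back to the start."""
    return frame_num % max_frame_num

def should_attach_labels(batch_count: int, interval: int) -> bool:
    """With interval=N, N consecutive batches are skipped between label
    attachments; interval=0 attaches labels to every batch."""
    return batch_count % (interval + 1) == 0

# Per-stream limits parsed from a semicolon-separated property value.
max_frame_nums = [int(v) for v in "100;100;100".split(";")]
```

With `max-frame-nums=100`, frame 105 replays the labels of frame 5; with `interval=2`, labels are attached on batches 0, 3, 6, and so on.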
MOT Detection Format#
The plugin expects detection data in the MOT (Multiple Object Tracking) challenge format, which is a CSV format where each line represents one detection. The format is widely used in computer vision research and provides a standardized way to represent object detections.
Format Specification:
Each line in the CSV file has the following structure:
frame_num, object_id, left, top, width, height, confidence
Where:
frame_num: Frame number (1-indexed in the label file, converted to 0-indexed internally)
object_id: Always set to -1 for detection
left: Left coordinate of the bounding box (x-coordinate of top-left corner)
top: Top coordinate of the bounding box (y-coordinate of top-left corner)
width: Width of the bounding box
height: Height of the bounding box
confidence: Detection confidence score, or a validity flag (range: 0.0 to 1.0)
Example:
1,-1,100.5,200.3,50.2,80.1,0.95
1,-1,300.0,150.0,60.0,90.0,0.88
2,-1,102.0,201.0,51.0,81.0,0.96
2,-1,302.5,151.2,60.5,90.5,0.89
This example shows two detections in each of frames 1 and 2.
Important Notes:
Frame numbers in label files start from 1, but are converted to 0-indexed internally by DeepStream
All coordinates are in the scale defined by label-width and label-height properties
The plugin automatically scales coordinates to match the actual frame resolution
If a frame has no detections, it can be omitted from the label file (the plugin handles missing frames)
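A parser for this format can be sketched in a few lines of Python. This is an offline illustration of the format rules above (1-indexed frame numbers, `object_id` skipped, missing frames allowed); the function name is hypothetical and does not correspond to the plugin's C++ code:

```python
import csv
import io

def parse_mot_detections(text: str):
    """Parse MOT-challenge detection lines into per-frame lists.
    Frame numbers are 1-indexed in the file and converted to
    0-indexed keys here, mirroring the plugin's behavior."""
    frames = {}
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue
        frame_num = int(row[0]) - 1          # 1-indexed -> 0-indexed
        left, top, width, height = map(float, row[2:6])  # row[1] is object_id (-1)
        confidence = float(row[6])
        frames.setdefault(frame_num, []).append(
            (left, top, width, height, confidence))
    return frames

dets = parse_mot_detections(
    "1,-1,100.5,200.3,50.2,80.1,0.95\n"
    "2,-1,302.5,151.2,60.5,90.5,0.89\n")
```

Frames absent from the file simply produce no entry, matching the note above that frames without detections can be omitted.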
Configuration#
The Gst-nvreplay plugin can be configured through the DeepStream application configuration file or through GStreamer properties.
DeepStream Configuration File Example:
When using the plugin in a DeepStream application, you must disable the primary inference engine and enable the replay plugin:
[primary-gie]
enable=0
...
[replay]
enable=1
label-dir=/path/to/labels/
file-names=source0_detection.txt;source1_detection.txt;source2_detection.txt;source3_detection.txt
max-frame-nums=100;100;100;100
label-width=1920
label-height=1080
interval=0
File Organization:
The plugin supports two directory structures:
Flat structure (all files in the label directory):
/path/to/labels/
source0_detection.txt
source1_detection.txt
source2_detection.txt
source3_detection.txt
Configuration:
file-names=source0_detection.txt;source1_detection.txt;source2_detection.txt;source3_detection.txt
Nested structure (files organized in subdirectories):
/path/to/labels/
source0/
detection.txt
source1/
detection.txt
source2/
detection.txt
source3/
source3_subdir/
detection.txt
Configuration:
file-names=source0/detection.txt;source1/detection.txt;source2/detection.txt;source3/source3_subdir/detection.txt
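Both layouts work because each `file-names` entry is resolved relative to `label-dir`, and entries may themselves contain subdirectory components. A minimal sketch of that resolution (the helper name is hypothetical):

```python
import os

def resolve_label_paths(label_dir: str, file_names: str):
    """Join label-dir with each semicolon-separated file-names entry;
    entries may contain subdirectories, so the same logic covers both
    flat and nested layouts."""
    return [os.path.join(label_dir, name) for name in file_names.split(";")]

paths = resolve_label_paths(
    "/path/to/labels",
    "source0/detection.txt;source3/source3_subdir/detection.txt")
```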
Metadata Generation#
The plugin generates NvDsObjectMeta for each detection in the label files with the following attributes:
rect_params: Bounding box coordinates (scaled to actual frame dimensions)
left, top, width, height: Scaled from label coordinates to frame coordinates
object_id: Set to UNTRACKED_OBJECT_ID (default: 0xFFFFFFFFFFFFFFFF)
class_id: Set to LABEL_CLASS (default: 1)
obj_label: Set to "Person" by default
confidence: Detection confidence score from the label file
unique_component_id: Set to 15 (default unique ID for the plugin)
text_params: Display parameters for object label
The plugin also sets frame_meta->bInferDone = TRUE to indicate that “inference” (in this case, label loading) has been completed for the frame.
Coordinate Scaling:
The plugin automatically handles coordinate scaling between label space and actual frame space:
scale_width = frame_width / label_width
scale_height = frame_height / label_height
All bounding box coordinates from the label file are multiplied by these scale factors to ensure correct positioning on frames of any resolution.
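The scaling above can be written out as a small function. This is an illustrative sketch of the arithmetic, not the plugin's actual C++ implementation:

```python
def scale_bbox(left, top, width, height,
               label_w, label_h, frame_w, frame_h):
    """Scale a bounding box from label space to frame space using the
    per-axis factors scale_width = frame_w / label_w and
    scale_height = frame_h / label_h."""
    sx = frame_w / label_w
    sy = frame_h / label_h
    return (left * sx, top * sy, width * sx, height * sy)

# A box labeled against 1920x1080, replayed on a 1280x720 stream
# (both axes scale by 2/3):
box = scale_bbox(300.0, 150.0, 60.0, 90.0, 1920, 1080, 1280, 720)
```

Here the labeled box (300, 150, 60, 90) lands at approximately (200, 100, 40, 60) on the 1280x720 frame.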
Source Code and Compilation#
The source code for Gst-nvreplay is located in:
deepstream/sources/gst-plugins/gst-nvreplay/
Key Files:
gstnvreplay.cpp: Main plugin implementation
gstnvreplay.h: Plugin header and data structures
gstnvreplay_lib.h: Library interface header
README: Detailed usage instructions and examples
Compilation:
To compile and install the plugin:
cd deepstream/sources/gst-plugins/gst-nvreplay/
make && sudo -E make install
Customization#
To customize the plugin for specific use cases, you may need to modify the source code:
Multi-class Support: Modify attach_metadata_full_frame() in gstnvreplay.cpp to parse class IDs from label files. The current implementation assigns all objects to the class ID specified by LABEL_CLASS (default: 1).
Object Label Customization: Change the LABEL_CLASS and LABEL_NAME constants to customize the default object class ID and label:
#define LABEL_CLASS 1
#define LABEL_NAME "Face"
Support for Additional Formats: Extend load_replay_data() function to parse label formats other than MOT (e.g., COCO JSON, KITTI format, or custom formats).
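Instead of extending load_replay_data() in C++, an alternative is to convert other formats to MOT offline so the stock plugin can read them. Below is a hedged sketch of such a converter for a minimal COCO-style detection list (COCO bboxes are [x, y, width, height]); the function name and the assumption of a 1-indexed "image_id" field are illustrative, not part of the plugin:

```python
import json

def coco_to_mot(coco_json: str) -> str:
    """Convert a minimal COCO-style detection list into MOT-challenge
    CSV lines. Assumes each entry carries a 1-indexed frame id in
    "image_id"; object_id is fixed at -1, as the plugin expects for
    detections."""
    lines = []
    for det in json.loads(coco_json):
        x, y, w, h = det["bbox"]
        lines.append("%d,-1,%.1f,%.1f,%.1f,%.1f,%.2f"
                     % (det["image_id"], x, y, w, h, det["score"]))
    return "\n".join(lines)

mot = coco_to_mot(
    '[{"image_id": 1, "bbox": [100.5, 200.3, 50.2, 80.1], "score": 0.95}]')
```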
After making modifications, recompile and reinstall the plugin.