Gst-nvtracker¶
This plugin allows the DS pipeline to use a low-level tracker to track the detected objects with unique IDs. It supports any low-level library that implements the NvDsTracker API, including the three reference implementations: the NvDCF, KLT, and IOU trackers. As part of this API, the plugin queries the low-level library for capabilities and requirements concerning the input format and memory type. Based on these queries, the plugin then converts input frame buffers into the format requested by the low-level library. For example, the KLT tracker uses Luma-only format; NvDCF uses NV12 or RGBA; and IOU requires no buffer at all.
The low-level capabilities also include support for batch processing across multiple input streams. Batch processing is typically more efficient than processing each stream independently, so if a low-level library supports batch processing, it is the preferred mode of operation. However, this preference can be overridden with the enable-batch-process configuration option if the low-level library supports both batch and per-stream modes.
The low-level capabilities also include support for passing past-frame data, which is object tracking data generated in past frames but not yet reported as output. This can happen when the low-level tracker stores tracking data generated in past frames internally because of, say, low confidence, but later decides to report it due to, say, increased confidence. This past-frame data is reported as user meta and can be enabled with the enable-past-frame configuration option.
The plugin accepts NV12- or RGBA-formatted frame data from the upstream component and scales (converts) the input buffer to a buffer in the format required by the low-level library, with the tracker width and height. (Tracker width and height must be specified in the configuration file’s [tracker] section.) The low-level tracker library is selected via the ll-lib-file configuration option in the tracker configuration section. The selected low-level library may also require its own configuration file, which can be specified via the ll-config-file option. A sample [tracker] section is shown after the list below. The three reference low-level tracker libraries support different tracking algorithms:
The KLT tracker uses a CPU-based implementation of the Kanade-Lucas-Tomasi (KLT) tracker algorithm. This library requires no configuration file.
The Intersection-Over-Union (IOU) tracker uses the IOU values between the detector’s bounding boxes in two consecutive frames to perform the association between them or to assign a new ID. This library takes an optional configuration file.
The NVIDIA®-adapted Discriminative Correlation Filter (NvDCF) tracker uses a correlation filter-based online discriminative learning algorithm as a visual object tracker, while using a data association algorithm for multi-object tracking. This library accepts an optional configuration file.
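As an illustration, a minimal [tracker] section in a deepstream-app configuration file might look like the following; only options named in this section are used, and the library path and values are placeholders to be adapted to your setup:
[tracker]
tracker-width=640
tracker-height=384
gpu-id=0
ll-lib-file=/opt/nvidia/deepstream/deepstream/lib/libnvds_nvdcf.so
ll-config-file=tracker_config.yml
enable-batch-process=1
enable-past-frame=1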
Inputs and Outputs¶
This section summarizes the inputs, outputs, and communication facilities of the Gst-nvtracker plugin.
Inputs
Gst Buffer (batched)
NvDsBatchMeta
Color formats supported for the input video frame are NV12 and RGBA.
Control parameters
tracker-width
tracker-height
gpu-id (dGPU only)
ll-lib-file
ll-config-file
enable-batch-process
enable-past-frame
tracking-surface-type
display-tracking-id
Output
Gst Buffer (provided as an input)
NvDsBatchMeta (with addition of tracked object coordinates, tracker confidence, and object IDs in NvDsObjectMeta)
Note
If the tracker algorithm does not generate a confidence value, the tracker confidence value is set to -0.1 for tracked objects.
For the KLT and IOU trackers, tracker_confidence is set to -0.1 because these algorithms do not generate confidence values for tracked objects. The NvDCF tracker generates confidence values for the tracked objects, and the value is set in the tracker_confidence field of the NvDsObjectMeta structure.
The following table summarizes the features of the plugin.
| Feature | Description | Release |
|---|---|---|
| Configurable tracker width/height | Frames are internally scaled to specified resolution for tracking | DS 2.0 |
| Multi-stream CPU/GPU tracker | Supports tracking on batched buffers consisting of frames from different sources | DS 2.0 |
| NV12 Input | — | DS 2.0 |
| RGBA Input | — | DS 3.0 |
| Allows low FPS tracking | IOU tracker | DS 3.0 |
| Configurable GPU device | User can select GPU for internal scaling/color format conversions and tracking | DS 2.0 |
| Dynamic addition/deletion of sources at runtime | Supports tracking on new sources added at runtime and cleanup of resources when sources are removed | DS 3.0 |
| Support for user’s choice of low-level library | Dynamically loads user-selected low-level library | DS 4.0 |
| Support for batch processing | Supports sending frames from multiple input streams to the low-level library as a batch if the low-level library advertises capability to handle that | DS 4.0 |
| Support for multiple buffer formats as input to low-level library | Converts input buffer to formats requested by the low-level library, for up to 4 formats per frame | DS 4.0 |
| Support for reporting past-frame data | Supports reporting past-frame data if the low-level library supports the capability | DS 5.0 |
| Support for enabling tracking-id display | Supports enabling or disabling display of tracking-id | DS 5.0 |
Gst Properties¶
The following table describes the Gst properties of the Gst-nvtracker plugin.
| Property | Meaning | Type and Range | Example Notes |
|---|---|---|---|
| tracker-width | Frame width at which the tracker is to operate, in pixels. | Integer, 0 to 4,294,967,295 | tracker-width=640 |
| tracker-height | Frame height at which the tracker is to operate, in pixels. | Integer, 0 to 4,294,967,295 | tracker-height=384 |
| ll-lib-file | Pathname of the low-level tracker library to be loaded by Gst-nvtracker. | String | ll-lib-file=/opt/nvidia/deepstream/deepstream/lib/libnvds_nvdcf.so |
| ll-config-file | Configuration file for the low-level library, if needed. | Path to configuration file | ll-config-file=tracker_config.yml |
| gpu-id | ID of the GPU on which device/unified memory is to be allocated, and with which buffer copy/scaling is to be done. (dGPU only.) | Integer, 0 to 4,294,967,295 | gpu-id=0 |
| enable-batch-process | Enables/disables batch processing mode. Only effective if the low-level library supports both batch and per-stream processing. (Optional.) | Boolean | enable-batch-process=1 |
| enable-past-frame | Enables/disables reporting of past-frame data. Only effective if the low-level library supports it. | Boolean | enable-past-frame=1 |
| tracking-surface-type | Surface stream type for tracking (default value is 0). | Integer, ≥0 | tracking-surface-type=0 |
| display-tracking-id | Enables tracking ID display on OSD. | Boolean | display-tracking-id=1 |
| iou-threshold | Intersection-over-union threshold for considering two bounding boxes for association (IOU tracker only). | Float, 0.0 to 1.0 | iou-threshold=0.6 |
| compute-hw | Compute engine to use for scaling. 0 - Default; 1 - GPU; 2 - VIC (Jetson only) | Integer, 0 to 2 | compute-hw=1 |
Custom Low-Level Library¶
To write a custom low-level tracker library, a user can implement the API defined in sources/includes/nvdstracker.h. Parts of the API refer to sources/includes/nvbufsurface.h.
The names of API functions and data structures are prefixed with NvMOT, which stands for NVIDIA Multi-Object Tracker.
This is the general flow of the API from a low-level library’s perspective:
The first required function is:
NvMOTStatus NvMOT_Query ( uint16_t customConfigFilePathSize, char* pCustomConfigFilePath, NvMOTQuery *pQuery );
The plugin uses this function to query the low-level library’s capabilities and requirements before it starts any processing sessions (contexts) with the library. Queried properties include the input frame’s color format (e.g., RGBA or NV12), memory type (e.g., NVIDIA® CUDA® device or CPU-mapped NVMM), and support for batch processing.
The plugin performs this query once during the initialization stage, and its results apply to all contexts established with the low-level library. If a low-level library configuration file is specified, it is provided in the query for the library to consult. The query reply structure, NvMOTQuery, contains the following fields:
NvMOTCompute computeConfig: Reports the compute targets supported by the library. The plugin currently only echoes the reported value when initiating a context.
uint8_t numTransforms: The number of color formats required by the low-level library. The valid range for this field is 0 to NVMOT_MAX_TRANSFORMS. Set this to 0 if the library does not require any visual data. Note that 0 does not mean that untransformed data will be passed to the library.
NvBufSurfaceColorFormat colorFormats[NVMOT_MAX_TRANSFORMS]: The list of color formats required by the low-level library. Only the first numTransforms entries are valid.
NvBufSurfaceMemType memType: Memory type for the transform buffers. The plugin allocates buffers of this type to store color- and scale-converted frames, and the buffers are passed to the low-level library for each frame. Note that support is currently limited to the following types:
dGPU: NVBUF_MEM_CUDA_PINNED, NVBUF_MEM_CUDA_UNIFIED
Jetson: NVBUF_MEM_SURFACE_ARRAY
bool supportBatchProcessing: True if the low-level library supports batch processing across multiple streams; otherwise false.
bool supportPastFrame: True if the low-level library supports outputting past-frame data; otherwise false.
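As a minimal sketch, a low-level library that tracks on NV12 frames in CUDA pinned memory and supports batch processing might fill the query as follows. The capability values are illustrative choices, and enum values such as NVMOTCOMP_GPU and NVBUF_COLOR_FORMAT_NV12 are assumed from nvdstracker.h and nvbufsurface.h rather than quoted from this section:
#include <stdbool.h>
#include "nvdstracker.h"

NvMOTStatus NvMOT_Query(uint16_t customConfigFilePathSize,
                        char *pCustomConfigFilePath,
                        NvMOTQuery *pQuery)
{
    // Advertise what this library needs and supports; the plugin converts
    // incoming frames accordingly before every NvMOT_Process() call.
    pQuery->computeConfig = NVMOTCOMP_GPU;              // assumed enum value for GPU compute
    pQuery->numTransforms = 1;                          // exactly one color format is required
    pQuery->colorFormats[0] = NVBUF_COLOR_FORMAT_NV12;  // assumed nvbufsurface.h enum
    pQuery->memType = NVBUF_MEM_CUDA_PINNED;            // dGPU memory type from the list above
    pQuery->supportBatchProcessing = true;              // one context serves all streams
    pQuery->supportPastFrame = false;                   // no buffered past-frame output
    // If pCustomConfigFilePath is non-empty, it can be parsed here to tune the reply.
    (void)customConfigFilePathSize;
    return NvMOTStatus_OK;
}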
After the query, and before any frames arrive, the plugin must initialize a context with the low-level library by calling:
NvMOTStatus NvMOT_Init( NvMOTConfig *pConfigIn, NvMOTContextHandle *pContextHandle, NvMOTConfigResponse *pConfigResponse );
The context handle is opaque outside the low-level library. In batch processing mode, the plugin requests a single context for all input streams. In per-stream processing mode, the plugin makes this call for each input stream so that each stream has its own context. This call includes a configuration request for the context. The low-level library has an opportunity to:
Review the configuration, and create a context only if the request is accepted. If any part of the configuration request is rejected, no context is created, and the return status must be set to NvMOTStatus_Error. The pConfigResponse field can optionally contain status for specific configuration items.
Pre-allocate resources based on the configuration.
Note
In the NvMOTMiscConfig structure, the logMsg field is currently unsupported and uninitialized.
The customConfigFilePath pointer is only valid during the call.
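A skeletal NvMOT_Init, shown purely as a sketch, validates the requested transforms, allocates an internal context object, and reports acceptance. The MyTrackerContext type is hypothetical, and the summaryStatus field and NvMOTConfigStatus_* values are assumptions about the header, not guarantees:
#include <stdlib.h>
#include "nvdstracker.h"

// Hypothetical per-context state owned by the low-level library.
typedef struct
{
    NvMOTConfig config;   // copy of the accepted configuration (embedded pointers such as
                          // customConfigFilePath are only valid during the NvMOT_Init call)
    // ... per-stream or batch-wide tracker state would live here ...
} MyTrackerContext;

NvMOTStatus NvMOT_Init(NvMOTConfig *pConfigIn,
                       NvMOTContextHandle *pContextHandle,
                       NvMOTConfigResponse *pConfigResponse)
{
    // Reject a configuration this library cannot honor; no context is created in that case.
    if (pConfigIn->numTransforms != 1)
    {
        pConfigResponse->summaryStatus = NvMOTConfigStatus_Error;  // assumed field/enum
        return NvMOTStatus_Error;
    }

    MyTrackerContext *ctx = (MyTrackerContext *)calloc(1, sizeof(*ctx));
    if (!ctx)
        return NvMOTStatus_Error;

    // Parse customConfigFilePath here if needed, since it is only valid during this call.
    ctx->config = *pConfigIn;

    *pContextHandle = (NvMOTContextHandle)ctx;
    pConfigResponse->summaryStatus = NvMOTConfigStatus_OK;         // assumed enum value
    return NvMOTStatus_OK;
}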
Once a context is initialized, the plugin sends frame data along with detected object bounding boxes to the low-level library each time it receives such data from upstream. It always presents the data as a batch of frames, although the batch can contain only a single frame in per-stream processing contexts. Note that depending on the frame arrival timings at the tracker plugin, a frame batch can be either a full batch (containing a frame from every stream) or a partial batch (containing frames from only a subset of the streams). In either case, each batch is guaranteed to contain at most one frame from each stream.
The function call for this processing is:
NvMOTStatus NvMOT_Process(NvMOTContextHandle contextHandle, NvMOTProcessParams *pParams, NvMOTTrackedObjBatch *pTrackedObjectsBatch);
where:
pParams is a pointer to the input batch of frames to process. The structure contains a list of one or more frames, with at most one frame from each stream. Thus, no two frame entries have the same streamID. Each entry of frame data contains a list of one or more buffers in the color formats required by the low-level library, as well as a list of object attribute data for the frame. Most libraries require at most one color format.
pTrackedObjectsBatch is a pointer to the output batch of object attribute data. It is pre-populated with a value for numFilled, the number of frames included in the input parameters.
If a frame has no output object attribute data, it is still counted in numFilled and is represented with an empty list entry (NvMOTTrackedObjList). An empty list entry has the correct streamID set and numFilled set to 0.
Note
The output object attribute data NvMOTTrackedObj contains a pointer to the associated input object, associatedObjectIn. You must set this to the associated input object only for the frame where the input object is passed in. For example:
Frame 0: NvMOTObjToTrack X is passed in. The tracker assigns it ID 1, and the output object’s associatedObjectIn points to X.
Frame 1: Inference is skipped, so there is no input object from the detector to be associated. The tracker finds Object 1, and the output object’s associatedObjectIn points to NULL.
Frame 2: NvMOTObjToTrack Y is passed in. The tracker identifies it as Object 1. The output Object 1 has associatedObjectIn pointing to Y.
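A simplified sketch of the per-batch processing loop is shown below. It only illustrates how input and output batches line up by stream; the numFrames, frameList, and list members are assumptions about the structures in nvdstracker.h, and track_stream_frame() is a hypothetical helper:
#include "nvdstracker.h"

static void track_stream_frame(NvMOTContextHandle ctx, NvMOTFrame *frame,
                               NvMOTTrackedObjList *out);  // hypothetical helper, not shown

NvMOTStatus NvMOT_Process(NvMOTContextHandle contextHandle,
                          NvMOTProcessParams *pParams,
                          NvMOTTrackedObjBatch *pTrackedObjectsBatch)
{
    // At most one frame per stream in the batch; emit one output list per input frame.
    for (uint32_t i = 0; i < pParams->numFrames; i++)
    {
        NvMOTFrame *frame = &pParams->frameList[i];
        NvMOTTrackedObjList *objList = &pTrackedObjectsBatch->list[i];

        // The output list must carry the stream ID of the matching input frame,
        // even when nothing is reported for it (numFilled then stays 0).
        objList->streamID = frame->streamID;
        objList->numFilled = 0;

        // Run per-stream data association and localization; entries appended to
        // objList set associatedObjectIn only for detector objects passed in
        // this very frame, as described in the note above.
        track_stream_frame(contextHandle, frame, objList);
    }
    return NvMOTStatus_OK;
}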
Depending on the capability of the low-level tracker, there may be tracked-object data generated in past frames but stored internally without being reported, due to, say, low confidence in those frames. If the tracker becomes more confident in later frames and is ready to report that data, it can be retrieved from the tracker plugin using the following function call. Past-frame data is output to batch_user_meta_list in NvDsBatchMeta:
NvMOTStatus NvMOT_ProcessPast(NvMOTContextHandle contextHandle, NvMOTProcessParams *pParams, NvDsPastFrameObjBatch *pPastFrameObjBatch);
where:
pParams is a pointer to the input batch of frames to process. This structure is needed to check the list of stream IDs in the batch.
pPastFrameObjBatch is a pointer to the output batch of object attribute data generated in past frames. The data structure NvDsPastFrameObjBatch is defined in include/nvds_tracker_meta.h. It may include a set of tracking data for each stream in the input. For each object, there can be multiple entries of past-frame data if the tracking data is stored for multiple frames for that object.
If a video stream source is removed on the fly, the plugin calls the following function so that the low-level tracker library can remove it as well. Note that this API is optional and valid only when batch processing mode is enabled, meaning that it is executed only if the low-level tracker library has an actual implementation. If called, the low-level tracker library can release any per-stream resources it may have allocated:
void NvMOT_RemoveStreams(NvMOTContextHandle contextHandle, NvMOTStreamId streamIdMask);
When all processing is complete, the plugin calls this function to clean up the context:
void NvMOT_DeInit(NvMOTContextHandle contextHandle);
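Putting the calls together, the plugin-side call order over the lifetime of one context looks roughly like the sketch below. The function prototypes are those listed in this section; the wrapper function, its parameter plumbing, and the single-iteration stand-in for the per-batch loop are illustrative only:
#include "nvdstracker.h"
#include "nvds_tracker_meta.h"   // defines NvDsPastFrameObjBatch

// Illustrative call order as driven by Gst-nvtracker (error handling omitted).
void run_tracker_lifecycle(char *configPath, uint16_t configPathLen,
                           NvMOTConfig *config, NvMOTProcessParams *params,
                           NvMOTTrackedObjBatch *trackedObjBatch,
                           NvDsPastFrameObjBatch *pastFrameObjBatch,
                           NvMOTStreamId removedStreamIdMask)
{
    NvMOTQuery query;
    NvMOT_Query(configPathLen, configPath, &query);   // once, at plugin initialization

    NvMOTContextHandle ctx = NULL;
    NvMOTConfigResponse resp;
    NvMOT_Init(config, &ctx, &resp);                  // one context per stream, or one for the batch

    // Repeated for every batch received from upstream:
    NvMOT_Process(ctx, params, trackedObjBatch);
    if (query.supportPastFrame)
        NvMOT_ProcessPast(ctx, params, pastFrameObjBatch);  // optional past-frame retrieval

    NvMOT_RemoveStreams(ctx, removedStreamIdMask);    // optional, when a source is removed at runtime
    NvMOT_DeInit(ctx);                                // when all processing is complete
}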
Low-Level Tracker Library Comparisons and Tradeoffs¶
The DeepStream SDK provides three reference low-level tracker libraries that have different resource requirements and performance characteristics in terms of accuracy, robustness, and efficiency, allowing users to choose the best tracker for their use cases and requirements. See the following table for a comparison.
| Tracker | GPU Computational Load | CPU Computational Load | Pros | Cons | Best Use Cases |
|---|---|---|---|---|---|
| IOU | X | Very Low | Lightweight | No visual features for matching, so prone to frequent tracker ID switches and failures. Not suitable for fast-moving scenes. | Objects are sparsely located, with distinct sizes. Detector is expected to run every frame or very frequently (e.g., every alternate frame). |
| KLT | X | High | Works reasonably well for simple scenes | High CPU utilization. Susceptible to changes in visual appearance due to noise and perturbations, such as shadows, non-rigid deformation, out-of-plane rotation, and partial occlusion. Cannot work on objects with low texture. | Objects with strong textures and simpler backgrounds. Ideal when ample CPU resources are available. |
| NvDCF | Medium | Low | Highly robust against partial occlusions, shadows, and other transient visual changes. Less frequent ID switches. | Slower than KLT and IOU due to increased computational complexity. Reduces the total number of streams processed. | Multi-object, complex scenes with partial occlusion. |
NvDCF Low-Level Tracker¶
Multi-object tracking (MOT) is a key building block for a large number of intelligent video analytics (IVA) applications, where analyzing the temporal changes of objects’ states is required. In a typical MOT implementation, learning the target model and localization are usually carried out on a per-object basis, creating a potentially large number of small CUDA kernel launches when processed on the GPU. This inherently poses challenges in maximizing GPU utilization, especially when a large number of objects from multiple video streams are expected to be tracked on a single GPU. In addition, the association of object IDs across frames for robust tracking typically entails a feature matching process, where feature extraction at each candidate location is usually computationally expensive and often becomes a performance bottleneck in tracking.
NvDCF is a reference implementation of a custom low-level tracker library which supports multi-stream, multi-object tracking in batch mode, using a discriminative correlation filter (DCF) based approach for visual object tracking and a data association algorithm (such as the Hungarian algorithm) based on visual and contextual data. NvDCF provides an efficient and scalable solution for robust multi-object tracking of many video streams in video processing pipelines by employing (1) a batched execution model for end-to-end DCF-based tracking operations accelerated on the GPU and (2) a novel approach of using the correlation response as a visual similarity score for matching in data association.
NvDCF is designed to run the MOT module in batch processing mode to maximize GPU utilization despite the small CUDA kernels inherent in per-object tracking operations. The batched processing mode is applied to the entire set of tracking operations, including bbox cropping and scaling, feature extraction, correlation filter learning, and localization. This can be viewed as a model similar to batched cuFFT or batched cuBLAS calls, the difference being that the batched MOT model spans many operations at a higher level. The batch processing capability is extended from multi-object batching to batching of multiple streams for even greater efficiency and scalability. Because of the dynamic lifetime of an object tracker, the number of object trackers at any given frame changes over time. Thus, the size of the batched MOT operation is dynamically adjusted accordingly to minimize the compute load on the GPU.
NvDCF allocates memory during initialization based on:
The number of streams to be processed
The maximum number of objects to be tracked per stream (denoted as maxTargetsPerStream in a configuration file for the NvDCF low-level library, tracker_config.yml)
Thus, the GPU memory usage by NvDCF is linearly proportional to the total number of objects being tracked, which is (number of video streams) × (maxTargetsPerStream). Because all the necessary memory is pre-allocated, no memory growth is expected during runtime.
Once the number of objects being tracked reaches the configured maximum value, any new objects will be discarded until resources for some existing tracked objects are released. Note that the number of objects being tracked includes objects that are tracked in shadow mode (described below). Therefore, NVIDIA recommends that you make maxTargetsPerStream large enough to accommodate the maximum number of objects of interest that may appear in a frame, as well as the objects that may have been tracked from past frames in shadow mode. To allow NvDCF to store and report such objects tracked in shadow mode from past frames (i.e., past-frame data), the user needs to set useBufferedOutput: 1 in the low-level config (e.g., tracker_config.yml) and set enable-past-frame=1 and enable-batch-process=1 under [tracker] in the deepstream-app config file, because past-frame data is only supported in batch processing mode.
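As a minimal sketch, enabling past-frame reporting therefore touches both configuration files; the lines below use only the options named above, and everything else in those files is left to the existing setup:
In tracker_config.yml (NvDCF low-level config):
useBufferedOutput: 1
In the deepstream-app config file:
[tracker]
enable-batch-process=1
enable-past-frame=1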
DCF-based trackers typically employ an exponential moving average for temporal consistency when the optimal correlation filter is created and updated over consecutive frames. The learning rates for this moving average can be configured as filterLr and filterChannelWeightsLr for the correlation filter and the channel weights, respectively. The standard deviation of the Gaussian for the desired response when creating an optimal DCF filter can also be configured as gaussianSigma.
DCF-based trackers also define a search region around the detected target location that is large enough for the same target to be detected in that region in the next frame. The SearchRegionPaddingScale property determines the size of the search region as a multiple of the diagonal of the target’s bounding box. The size of the search region is determined as:
searchRegionWidth = w + SearchRegionPaddingScale*sqrt(w*h)
searchRegionHeight = h + SearchRegionPaddingScale*sqrt(w*h)
where w and h are the width and height of the target’s bounding box, respectively.
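For example, with a hypothetical target bounding box of 64×128 pixels and SearchRegionPaddingScale: 1, the padding term is sqrt(64*128) ≈ 90.5 pixels, so the search region is approximately 154.5 × 218.5 pixels; with SearchRegionPaddingScale: 3 it grows to approximately 335.5 × 399.5 pixels.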
Once the search region is defined for each target, the image patches for each of the search regions are cropped and scaled to a predefined feature image size, from which the visual features are extracted. The featureImgSizeLevel
property defines the size of the feature image. A lower value of featureImgSizeLevel
causes NvDCF to use a smaller feature size, increasing GPU performance potentially at the cost of accuracy and robustness. Consider the relationship between featureImgSizeLevel
and SearchRegionPaddingScale
when configuring the parameters. If SearchRegionPaddingScale
is increased while featureImgSizeLevel
is fixed, the number of pixels corresponding to the target in the feature images will be effectively decreased.
The minDetectorConfidence
property sets the confidence level below which object detection results are filtered out.
To achieve robust tracking, NvDCF employs two approaches for handling false alarms from the detector at PGIE: Late Activation for handling false positives and Shadow Tracking for false negatives. Whenever a new object is detected, a new tracker is instantiated in a temporary mode, called Tentative. It must be activated to be considered a valid target (i.e., a target in Active mode). The Tentative mode is a probationary period whose length is defined by probationAge. If a target is not detected (precisely speaking, not associated with a detector bbox) for longer than earlyTerminationAge frames during this period, the tracker for this target is terminated prematurely.
Once a target is activated and put into Active mode, it will be put into Inactive mode if:
No matching detector input is found during the data association, or
The tracker confidence falls below a threshold defined by
minTrackerConfidence
.
The per-object tracker will be put into Active mode again if a matching detector input is found.
The length of period during which a per-object tracker is in Inactive mode is called the shadow tracking age. If it reaches the threshold defined by maxShadowTrackingAge
, the tracker will be terminated.
If the bounding box of an object being tracked goes partially out of the image frame and so its visibility falls below a predefined threshold defined by minVisibiilty4Tracking
, the tracker will also be terminated.
The state transitions of a target tracker are summarized in the following diagram:
NvDCF can generate unique IDs to some extent. If enabled by setting useUniqueID: 1, NvDCF generates a 32-bit random number at the initialization stage and uses it as the upper 32 bits of the uint64_t-type target IDs. The randomly generated upper 32-bit number allows the target ID to increment from a random position in the possible ID space. The initial value of the lower 32 bits of the target ID starts from 0. If disabled (which is the default), target ID generation simply increments from 0.
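For illustration only, the resulting ID layout can be sketched as below; the function and variable names are hypothetical and do not reflect NvDCF internals:
#include <stdint.h>

/* Sketch of composing a 64-bit target ID from a random upper half
 * (generated once at initialization when useUniqueID: 1) and a lower
 * half that increments for each new target. */
uint64_t make_target_id(uint32_t randomUpper32, uint32_t *pNextLower32)
{
    uint64_t id = ((uint64_t)randomUpper32 << 32) | (uint64_t)(*pNextLower32);
    (*pNextLower32)++;   // the next new target gets the next lower-32-bit value
    return id;
}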
NvDCF employs two types of state estimators: a Moving Average Estimator (MAE) and a Kalman Filter (KF). Both the KF and the MAE have 7 states defined, {x, y, a, h, dx, dy, dh}, where x and y are the coordinates of the top-left corner of a target bbox, a and h are the aspect ratio and the height of the bbox, respectively, and dx, dy, and dh are the velocities of x, y, and h, respectively.
The Kalman Filter used in NvDCF employs a constant velocity model for generic use. The measurement vector is defined as {x, y, a, h}
. The process noise variance for {x, y}
, {a, h}
, and {dx, dy, dh}
can be configured by kfProcessNoiseVar4Loc
, kfProcessNoiseVar4Scale
, and kfProcessNoiseVar4Vel
, respectively.
Note that from the state estimator’s point of view, there could be two different measurements: the bbox from the detector at PGIE and the bbox from the tracker. This is because NvDCF is capable of localizing targets using its own learned filter. The measurement noise variance for these two different types of measurements can be configured by kfMeasurementNoiseVar4Det
and kfMeasurementNoiseVar4Trk
.
The MAE is much simpler than the KF and therefore more efficient to compute. The learning rates for the moving average of the defined states can be configured by trackExponentialSmoothingLr_loc, trackExponentialSmoothingLr_scale, and trackExponentialSmoothingLr_velocity.
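As an illustration of this type of estimator (the exact update used internally is not spelled out here), an exponential moving average with learning rate lr updates each smoothed state s from a new measurement z as:
s_new = (1 - lr) * s_old + lr * z
so a smaller learning rate (e.g., trackExponentialSmoothingLr_velocity: 0.05) makes the corresponding state change more slowly and smoothly over frames.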
To enhance robustness, NvDCF allows the cross-correlation score with nearby objects to be added as an additional regularization term in the filter learning process, which is referred to as instance awareness. If enabled by setting useInstanceAwareness: 1, the number of nearby instances and the regularization weight for each instance are determined by the config params maxInstanceNum_ia and lambda_ia, respectively.
The following table summarizes the configuration parameters for an NvDCF low-level tracker.
| Property | Meaning | Type and Range | Example Notes |
|---|---|---|---|
| useUniqueID | Enable unique ID generation scheme | Boolean | useUniqueID: 1 |
| maxTargetsPerStream | Max number of targets to track per stream | Integer, 0 to 65535 | maxTargetsPerStream: 30 |
| useColorNames | Use ColorNames feature | Boolean | useColorNames: 1 |
| useHog | Use Histogram-of-Oriented-Gradient (HOG) feature | Boolean | useHog: 1 |
| useHighPrecisionFeature | Use high-precision numerical computation in feature extraction | Boolean | useHighPrecisionFeature: 1 |
| filterLr | Learning rate for DCF filter in exponential moving average | Float, 0.0 to 1.0 | filterLr: 0.11 |
| filterChannelWeightsLr | Learning rate for weights for different feature channels in DCF | Float, 0.0 to 1.0 | filterChannelWeightsLr: 0.22 |
| gaussianSigma | Standard deviation for Gaussian for desired response | Float, >0.0 | gaussianSigma: 0.75 |
| featureImgSizeLevel | Size of a feature image | Integer, 1 to 5 | featureImgSizeLevel: 1 |
| SearchRegionPaddingScale | Search region size | Integer, 1 to 3 | SearchRegionPaddingScale: 3 |
| minDetectorConfidence | Minimum detector confidence for a valid object | Float, -inf to inf | minDetectorConfidence: 0.0 |
| minTrackerConfidence | Minimum tracker confidence for a valid target | Float, 0.0 to 1.0 | minTrackerConfidence: 0.6 |
| minTargetBboxSize | Minimum bbox size for a valid target [pixels] | Integer, ≥0 | minTargetBboxSize: 10 |
| minDetectorBboxVisibilityTobeTracked | Minimum detector bbox visibility for a valid candidate | Float, 0.0 to 1.0 | minDetectorBboxVisibilityTobeTracked: 0 |
| minVisibiilty4Tracking | Minimum visibility of target bounding box to be considered valid | Float, 0.0 to 1.0 | minVisibiilty4Tracking: 0.1 |
| targetDuplicateRunInterval | Interval at which duplicate target removal is carried out [frames] | Integer, -inf to inf | targetDuplicateRunInterval: 5 |
| minIou4TargetDuplicate | Min IOU for two bboxes to be considered as duplicates | Float, 0.0 to 1.0 | minIou4TargetDuplicate: 0.9 |
| useGlobalMatching | Enable Hungarian method for data association | Boolean | useGlobalMatching: 1 |
| usePersistentThreads | Create data association threads once and re-use them | Boolean | usePersistentThreads: 0 |
| maxShadowTrackingAge | Maximum length of shadow tracking [frames] | Integer, ≥0 | maxShadowTrackingAge: 9 |
| probationAge | Length of probationary period [frames] | Integer, ≥0 | probationAge: 12 |
| earlyTerminationAge | Early termination age [frames] | Integer, ≥0 | earlyTerminationAge: 2 |
| minMatchingScore4Overall | Min total score for valid matching | Float, 0.0 to 1.0 | minMatchingScore4Overall: 0 |
| minMatchingScore4SizeSimilarity | Min bbox size similarity score for valid matching | Float, 0.0 to 1.0 | minMatchingScore4SizeSimilarity: 0.5 |
| minMatchingScore4Iou | Min IOU score for valid matching | Float, 0.0 to 1.0 | minMatchingScore4Iou: 0.1 |
| minMatchingScore4VisualSimilarity | Min visual similarity score for valid matching | Float, 0.0 to 1.0 | minMatchingScore4VisualSimilarity: 0.2 |
| matchingScoreWeight4VisualSimilarity | Weight for visual similarity term in matching cost function | Float, 0.0 to 1.0 | matchingScoreWeight4VisualSimilarity: 0.8 |
| matchingScoreWeight4SizeSimilarity | Weight for size similarity term in matching cost function | Float, 0.0 to 1.0 | matchingScoreWeight4SizeSimilarity: 0 |
| matchingScoreWeight4Iou | Weight for IOU term in matching cost function | Float, 0.0 to 1.0 | matchingScoreWeight4Iou: 0.1 |
| matchingScoreWeight4Age | Weight for tracking age term in matching cost function | Float, 0.0 to 1.0 | matchingScoreWeight4Age: 0.1 |
| useTrackSmoothing | Enable state estimator | Boolean | useTrackSmoothing: 1 |
| stateEstimatorType | Type of state estimator among {MovingAvg: 1, Kalman: 2} | Integer | stateEstimatorType: 2 |
| trackExponentialSmoothingLr_loc | Learning rate for location | Float, 0.0 to 1.0 | trackExponentialSmoothingLr_loc: 0.5 |
| trackExponentialSmoothingLr_scale | Learning rate for scale | Float, 0.0 to 1.0 | trackExponentialSmoothingLr_scale: 0.3 |
| trackExponentialSmoothingLr_velocity | Learning rate for velocity | Float, 0.0 to 1.0 | trackExponentialSmoothingLr_velocity: 0.05 |
| kfProcessNoiseVar4Loc | Process noise variance for location | Float, ≥0 | kfProcessNoiseVar4Loc: 0.1 |
| kfProcessNoiseVar4Scale | Process noise variance for scale | Float, ≥0 | kfProcessNoiseVar4Scale: 0.04 |
| kfProcessNoiseVar4Vel | Process noise variance for velocity | Float, ≥0 | kfProcessNoiseVar4Vel: 0.04 |
| kfMeasurementNoiseVar4Trk | Measurement noise variance for tracker bbox | Float, ≥0 | kfMeasurementNoiseVar4Trk: 9 |
| kfMeasurementNoiseVar4Det | Measurement noise variance for detector bbox | Float, ≥0 | kfMeasurementNoiseVar4Det: 9 |
| useBufferedOutput | Enable storing of past-frame data in a buffer and reporting it back | Boolean | useBufferedOutput: 1 |
| useInstanceAwareness | Enable instance awareness for multi-object tracking | Boolean | useInstanceAwareness: 1 |
| lambda_ia | Regularization factor for each instance | Float, ≥0 | lambda_ia: 2 |
| maxInstanceNum_ia | The number of nearby object instances to use for instance awareness | Integer, ≥0 | maxInstanceNum_ia: 4 |
To learn more about tuning NvDCF parameters, see the NvDCF Parameter Tuning Guide.
See also the Troubleshooting in NvDCF Parameter Tuning section for solutions to common problems in tracker behavior and tuning.
IOU Low-Level Tracker¶
The IOU low-level tracker uses the intersection over union between detected bounding boxes across frames to track objects. Its optional configuration file contains a single entry, iou-threshold. This setting is best left at its default value of 0.6. In certain cases where objects are sparse with low bounding box overlap, it may help to lower the threshold.
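As a minimal sketch, such a configuration file would contain just that one entry, shown here with the default value; the exact file syntax follows whatever the IOU tracker configuration file in your DeepStream release uses:
iou-threshold=0.6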
KLT Low-Level Tracker¶
The KLT low-level tracker performs feature extraction on the image on which objects are to be tracked. It also performs optical flow tracking of the features between two frames using a multi-resolution pyramid. Objects are tracked based on the feature points lying inside their bounding boxes. Feature extraction is performed whenever primary detection is performed.