Configuration
For more context on configuration, depending on where the microservice is used:
In the Multi-Target Multi-Camera Tracking (MTMC) app, refer to its Operation Parameters section.
In the Real Time Location System (RTLS) app, refer to its Operation Parameters section.
As a standalone microservice, refer to the README.md in its respective directory within metropolis-apps-standalone-deployment/modules/.
App Config
App Config in JSON
{
  "io": {
    "enableDebug": false,
    "inMtmcPlusBatchMode": false,
    "batchId": "1",
    "selectedSensorIds": [],
    "outputDirPath": "results",
    "videoDirPath": "metropolis-apps-data/videos/mtmc-app",
    "jsonDataPath": "metropolis-apps-data/playback/mtmc_buildingK_playback.json",
    "protobufDataPath": "",
    "groundTruthPath": "",
    "groundTruthFrameIdOffset": 1,
    "useFullBodyGroundTruth": false,
    "use3dEvaluation": false,
    "plotEvaluationGraphs": false
  },
  "preprocessing": {
    "filterByRegionsOfInterest": false,
    "timestampThreshMin": 120,
    "locationBboxBottomGapThresh": 0.02,
    "locationConfidenceThresh": 0.5,
    "locationBboxAreaThresh": 0.0008,
    "locationBboxAspectRatioThresh": 0.6,
    "embeddingBboxBottomGapThresh": 0.02,
    "embeddingConfidenceThresh": 0.5,
    "embeddingBboxAreaThresh": 0.0008,
    "embeddingBboxAspectRatioThresh": 0.6,
    "embeddingVisibilityThresh": 0.5,
    "behaviorConfidenceThresh": 0.45,
    "behaviorBboxAreaThresh": 0.0007,
    "behaviorBboxAspectRatioThresh": 0.75,
    "behaviorLengthThreshSec": 0.0,
    "shortBehaviorFinishThreshSec": 1.0,
    "behaviorNumLocationsMax": 9000,
    "behaviorSplitThreshSec": 6,
    "behaviorRetentionInStateSec": 600.0,
    "mtmcPlusRetentionInStateSec": 10.0,
    "mtmcPlusInitBufferLenSec": 10.0,
    "mtmcPlusReinitRatioAssignedBehaviors": 0.75,
    "mtmcPlusReinitDiffRatioClusters": null
  },
  "localization": {
    "rectifyBboxByCalibration": false,
    "peopleHeightMaxLengthSec": 600,
    "peopleHeightNumSamplesMax": 1000,
    "peopleHeightNumBatchFrames": 10000,
    "peopleHeightEstimationRatio": 0.7,
    "peopleHeightVisibilityThresh": 0.8,
    "overwrittenPeopleHeightMeter": null
  },
  "clustering": {
    "clusteringAlgo": "HDBSCAN",
    "overwrittenNumClusters": null,
    "agglomerativeClusteringDistThresh": 3.5,
    "hdbscanMinClusterSize": 5,
    "numReassignmentIterations": 4,
    "reassignmentDistLooseThresh": 1.0,
    "reassignmentDistTightThresh": 0.12,
    "spatioTemporalDistLambda": 0.15,
    "spatioTemporalDistType": "Hausdorff",
    "spatioTemporalDirMagnitudeThresh": 0.5,
    "enableOnlineSpatioTemporalConstraint": true,
    "onlineSpatioTemporalDistThresh": 15.0,
    "suppressOverlappingBehaviors": false,
    "meanEmbeddingsUpdateRate": 0.1,
    "skipAssignedBehaviors": true,
    "enableOnlineDynamicUpdate": false,
    "dynamicUpdateAppearanceDistThresh": 0.2,
    "dynamicUpdateSpatioTemporalDistThresh": 10.0,
    "dynamicUpdateLengthThreshSec": 9.0
  },
  "streaming": {
    "kafkaBootstrapServers": "mdx-kafka-cluster-kafka-brokers:9092",
    "kafkaProducerLingerMs": 0,
    "kafkaMicroBatchIntervalSec": 60.0,
    "kafkaRawConsumerPollTimeoutMs": 10000,
    "kafkaNotificationConsumerPollTimeoutMs": 100,
    "kafkaConsumerMaxRecordsPerPoll": 100000,
    "sendEmptyMtmcPlusMessages": true,
    "mtmcPlusFrameBatchSizeMs": 180,
    "mtmcPlusBehaviorBatchesConsumed": 4,
    "mtmcPlusFrameBufferResetSec": 4.0,
    "mtmcPlusTimestampDelayMs": 100,
    "mtmcPlusLocationWindowSec": 1.0,
    "mtmcPlusSmoothingWindowSec": 1.0,
    "mtmcPlusNumProcessesMax": 8
  }
}
Instructions for Fine-Tuning App Config
Key areas for parameter fine-tuning:
Behavior Pre-processing: Adjust data quality and behavior retention for streaming.
Localization: Enhance tracking by addressing occlusions and estimating person height.
Clustering: Configure clustering algorithms and manage overlapping behaviors.
Streaming (Kafka): Control the duration of micro-batches for streaming.
Note
Pre-processing, localization, and clustering parameters can be updated in real-time via API. For more information, see here.
1. Behavior Pre-processing (Filtering)
Fine-tuning the parameters for behavior pre-processing, particularly the filtering process, involves adjusting various thresholds based on location, embeddings, and behavior. It’s important to remember the following:
Thresholds for the size or area of bounding boxes should be considered relative to the overall frame size or area.
The filterByRegionsOfInterest option allows for filtering based on predefined regions of interest established during the calibration phase.
Key groups of parameters:
Location-based Thresholds: These are crucial for filtering ground plane trajectories, which are used to calculate spatio-temporal distances. Filtering away too many locations can cause the algorithm to depend more heavily on appearance features.
Embedding-based Thresholds: These help to filter out feature embeddings that represent object appearances.
Behavior-based Thresholds: These thresholds have a direct impact on behavior analysis. Increasing these thresholds may reduce outliers during the clustering process.
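As a concrete illustration of how these thresholds interact, below is a hedged sketch of an embedding filter. The parameter names mirror the preprocessing config section, but the `Detection` structure and `keep_embedding` function are hypothetical, not the shipped implementation:

```python
# Illustrative sketch (not the actual microservice code) of applying the
# embedding-related preprocessing thresholds to a single detection.
from dataclasses import dataclass

@dataclass
class Detection:          # hypothetical structure for illustration
    confidence: float
    bbox: tuple           # (x, y, w, h) in pixels
    visibility: float

def keep_embedding(det, frame_w, frame_h, cfg):
    """Return True if the detection's embedding passes all thresholds."""
    x, y, w, h = det.bbox
    bottom_gap = (frame_h - (y + h)) / frame_h   # ratio against frame height
    area_ratio = (w * h) / (frame_w * frame_h)   # ratio against frame area
    aspect_ratio = w / h                         # width over height
    return (
        bottom_gap >= cfg["embeddingBboxBottomGapThresh"]
        and det.confidence >= cfg["embeddingConfidenceThresh"]
        and area_ratio >= cfg["embeddingBboxAreaThresh"]
        and aspect_ratio <= cfg["embeddingBboxAspectRatioThresh"]
        and det.visibility >= cfg["embeddingVisibilityThresh"]
    )
```

The location- and behavior-based filters follow the same pattern, with behavior filters applied to the mean values over a behavior's detections.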
The behaviorRetentionInStateSec parameter indicates how long (in seconds) a behavior is maintained in the system’s state. If a behavior ends before this timeframe in the current micro batch, it is removed from the state. A longer retention time means more historical data is kept, potentially increasing accuracy but requiring more memory and processing power for clustering. For the Multi-Camera Fusion - MTMC microservice, it’s best to limit retention time to the shorter of two figures: either the maximum predicted time an object could disappear from all cameras before reappearing, or the maximum time an object is expected to be tracked across multiple cameras in a semi-online mode. For the Multi-Camera Fusion - RTLS microservice, a shorter retention time is recommended to facilitate real-time processing. For more information on prolonged durations, refer to Query-by-Example.
Similarly, the mtmcPlusRetentionInStateSec parameter defines how long (in seconds) an MTMC plus object is retained in the system. A longer retention period allows the online tracking algorithm to utilize more spatio-temporal information for matching behaviors to improve accuracy. However, keeping this value low is advised for real-time processing efficiency.
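The retention behavior described above amounts to a time-windowed prune of the state. A minimal sketch, assuming hypothetical names and a dict-based state record:

```python
# Hypothetical sketch of retention-based pruning of the behavior state.
# retention_sec stands in for behaviorRetentionInStateSec (or
# mtmcPlusRetentionInStateSec for MTMC plus records).
def prune_state(behaviors, now_sec, retention_sec):
    """Keep behaviors whose end timestamp is within the retention window."""
    return [b for b in behaviors if now_sec - b["end_sec"] <= retention_sec]
```

With the default 600-second retention, a behavior that ended 900 seconds ago would be dropped, while one that ended 100 seconds ago survives into the next clustering round.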
For effective online tracking in RTLS, it’s necessary to initialize the MTMC plus state early on to enable Hungarian matching in subsequent batches. Behaviors are accumulated until the mtmcPlusInitBufferLenSec threshold is reached. If the initial object locations are unsatisfactory, increasing this buffer length may improve tracking accuracy.
In the RTLS microservice, the MTMC plus state can be re-initialized to adapt to dynamic changes in the following circumstances.
The ratio of matched behaviors in the previous batch falls below a specified threshold.
The number of clusters deviates significantly from a pre-defined number, being either too large or too small.
To minimize the frequency of MTMC plus state re-initializations, it is advisable to adjust the parameters mtmcPlusReinitRatioAssignedBehaviors and mtmcPlusReinitDiffRatioClusters. Specifically, reducing the former and/or increasing the latter can be effective, provided that overwrittenNumClusters is set accordingly. It is important to note that re-initialization involves running clustering and iterative Hungarian re-assignment processes, which may momentarily interrupt online tracking.
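The two re-initialization triggers above can be sketched as a simple predicate. This is an illustrative reading of the documented conditions, not the actual implementation; the function name and signature are hypothetical:

```python
# Sketch of the two documented re-initialization triggers. The thresholds
# mirror mtmcPlusReinitRatioAssignedBehaviors and
# mtmcPlusReinitDiffRatioClusters; expected_clusters mirrors
# overwrittenNumClusters.
def should_reinit(num_assigned, num_behaviors, num_clusters,
                  expected_clusters, ratio_assigned_thresh,
                  diff_ratio_clusters_thresh):
    # Trigger 1: too few behaviors were matched in the previous batch.
    if num_behaviors > 0:
        if num_assigned / num_behaviors < ratio_assigned_thresh:
            return True
    # Trigger 2: cluster count deviates too much from the preset number.
    if diff_ratio_clusters_thresh is not None and expected_clusters:
        diff_ratio = abs(num_clusters - expected_clusters) / expected_clusters
        if diff_ratio > diff_ratio_clusters_thresh:
            return True
    return False
```

Lowering the assigned-ratio threshold or raising the cluster-difference ratio makes both triggers fire less often, which is the tuning direction recommended above.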
2. Localization
For addressing occlusions, rectifyBboxByCalibration can be enabled. By default, the system uses the “foot position” (center point of the lower bounding box edge) to determine an individual’s location in a 3D environment. In cases of occlusion where only the upper body is visible, enabling calibration-based rectification is useful.
Upon enabling rectifyBboxByCalibration, the system:
- In the camera view, computes the “head position” (x_head, y_head) at the center of the top bounding box edge, and the “foot position” (x_foot, y_foot) at the center of the bottom edge.
- Projects the “head position” to the Z=people_height plane in the 3D world, determining the “foot position” in 3D as (X, Y, 0).
- Projects this “foot position” back to the camera view as (x_foot_estimated, y_foot_estimated).
- Computes visibility as min(1, (y_foot - y_head) / (y_foot_estimated - y_head)).
- If visibility falls below peopleHeightVisibilityThresh, the bounding box is adjusted.
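The visibility step above can be written out directly. This sketch covers only the ratio computation and threshold check; the camera projections themselves depend on the calibration and are not shown:

```python
# Sketch of the visibility computation used by calibration-based
# rectification, following the documented steps. y-coordinates grow
# downward in image space.
def compute_visibility(y_head, y_foot, y_foot_estimated):
    """Ratio of the observed body extent to the extent expected when
    projecting the head through the average-people-height plane."""
    denom = y_foot_estimated - y_head
    if denom <= 0:
        return 1.0  # degenerate geometry; treat as fully visible
    return min(1.0, (y_foot - y_head) / denom)

def needs_rectification(visibility, people_height_visibility_thresh=0.8):
    """True when the box should be adjusted (partially occluded person)."""
    return visibility < people_height_visibility_thresh
```

For example, if the detected box ends at y_foot = 300 but the projection predicts y_foot_estimated = 400 (with y_head = 100), visibility is 2/3, which falls below the default 0.8 threshold, so the box would be rectified.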
The system can estimate the average height of people by either collecting data at the start or by using a pre-defined height if overwrittenPeopleHeightMeter is set. Related parameters include:
rectifyBboxByCalibration: Activates the calibration-based rectification.
peopleHeightMaxLengthSec: Max duration for initial data collection to estimate height in streaming mode.
peopleHeightNumSamplesMax: Max number of bounding boxes for initial height estimation in streaming mode.
peopleHeightNumBatchFrames: Max frames for initial data collection in batch mode.
peopleHeightEstimationRatio: Portion of the collected data used for height estimation.
peopleHeightVisibilityThresh: Visibility threshold for bounding box adjustments.
overwrittenPeopleHeightMeter: Manually set height value to bypass system estimation.
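One plausible reading of peopleHeightEstimationRatio, sketched below under the assumption that the smallest height samples come from occluded people and are discarded (the exact selection rule in the microservice is not specified here):

```python
# Hedged sketch: estimate people's height from collected samples, keeping
# only the top portion (peopleHeightEstimationRatio) of sorted heights,
# since smaller measured heights tend to come from occluded instances.
def estimate_people_height(height_samples, estimation_ratio=0.7):
    """Mean of the largest `estimation_ratio` fraction of samples."""
    ordered = sorted(height_samples)
    cut = int(len(ordered) * (1 - estimation_ratio))
    kept = ordered[cut:]
    return sum(kept) / len(kept)
```

With ten samples where three occluded detections read 1.0 m and seven full-body detections read 1.8 m, the default ratio of 0.7 discards the three low readings and returns 1.8 m.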
3. Clustering
Our system supports two clustering algorithms through the clusteringAlgo parameter. Depending on your selection:
HDBSCAN: Use hdbscanMinClusterSize to fine-tune accuracy. Generally, a larger minimum cluster size results in fewer, but potentially more meaningful, output clusters.
AgglomerativeClustering: Adjust agglomerativeClusteringDistThresh to achieve the best clustering outcomes. A higher distance threshold tends to yield a smaller number of output clusters.
For effective parameter tuning, start with a representative micro batch to assess the total count of global IDs. Adjust these parameters until you reach the desired cluster count. If the number of resulting clusters is less than the maximum number of co-existing behaviors, the system will automatically make corrections. The overwrittenNumClusters parameter allows for directly setting the cluster count for the agglomerative clustering algorithm.
Depending on the robustness of the re-identification features, adjust reassignmentDistLooseThresh and reassignmentDistTightThresh accordingly to suppress ID switches. The spatioTemporalDistLambda, spatioTemporalDirMagnitudeThresh, and spatioTemporalDistType parameters control the weight of the spatio-temporal distance when it is combined with the appearance distance for Hungarian matching.
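To make the role of spatioTemporalDistLambda concrete, here is a sketch of one common way to blend the two normalized distance matrices before Hungarian matching. Whether the microservice uses exactly this convex combination is an assumption; the sketch only illustrates that a larger lambda gives the spatio-temporal term more weight:

```python
# Hedged sketch: blend normalized appearance and spatio-temporal distance
# matrices with a lambda weight (mirroring spatioTemporalDistLambda).
# The convex-combination form is an assumption for illustration.
def combined_cost(appearance_dist, spatio_temporal_dist, lam):
    """Element-wise blend of two equally shaped distance matrices
    (lists of lists); the result feeds Hungarian matching."""
    return [
        [(1.0 - lam) * a + lam * s for a, s in zip(row_a, row_s)]
        for row_a, row_s in zip(appearance_dist, spatio_temporal_dist)
    ]
```

With lam = 0, matching is purely appearance-based; with lam = 1, purely spatio-temporal; the default keeps appearance dominant while letting trajectories break appearance ties.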
In the RTLS microservice, to maintain continuous and smooth object locations, set enableOnlineSpatioTemporalConstraint to true. This ensures that only behaviors within a certain distance (onlineSpatioTemporalDistThresh) can be matched to an MTMC plus object in the state. The meanEmbeddingsUpdateRate parameter controls how quickly the mean appearance embeddings of each MTMC plus object are updated upon matching with new behaviors. To keep online matching swift and support real-time processing, you can opt to skip behaviors already assigned in the current batch by enabling skipAssignedBehaviors.
Furthermore, enableOnlineDynamicUpdate in the RTLS microservice allows the system to handle dynamic changes of MTMC plus state objects and adapt to objects entering or exiting the scene during online tracking. Shadow MTMC plus state objects are created for unmatched behaviors. These shadow objects are merged if their appearance distance and spatio-temporal distance fall within dynamicUpdateAppearanceDistThresh and dynamicUpdateSpatioTemporalDistThresh, respectively. They become normal objects once their accumulated length exceeds dynamicUpdateLengthThreshSec.
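The shadow-object lifecycle described above reduces to two small rules, sketched here with hypothetical names and a dict-based record (the real state objects are richer):

```python
# Illustrative sketch of the shadow-object lifecycle for online dynamic
# update. Thresholds mirror the dynamicUpdate* parameters; the data
# structures are hypothetical.
def merge_shadows(shadow_a, shadow_b, appearance_dist, st_dist,
                  appearance_thresh=0.2, st_thresh=10.0):
    """Merge shadow_b into shadow_a when both distances are within bounds."""
    if appearance_dist <= appearance_thresh and st_dist <= st_thresh:
        shadow_a["length_sec"] += shadow_b["length_sec"]
        return True
    return False

def promote(shadow, length_thresh_sec=9.0):
    """A shadow object becomes a normal MTMC plus object once its
    accumulated length exceeds dynamicUpdateLengthThreshSec."""
    return shadow["length_sec"] > length_thresh_sec
</```

A newly entered person thus accumulates evidence as a shadow object for roughly dynamicUpdateLengthThreshSec seconds before receiving a regular global ID.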
Additional parameters include:
numReassignmentIterations: Specifies the number of iterations for re-assigning co-existing behaviors using the Hungarian algorithm. More iterations can improve accuracy but may increase computation time.
reassignmentDistLooseThresh / reassignmentDistTightThresh: Set the distance thresholds for re-assigning a behavior to a cluster during Hungarian matching, with values ranging from 0.0 to 1.0.
spatioTemporalDistLambda: Balances how normalized spatio-temporal distances are combined with appearance-based distances for re-assigning co-existing behaviors.
spatioTemporalDistType: Offers two types of distance calculations: “Hausdorff” and “pairwise”, with the latter being more computationally efficient.
suppressOverlappingBehaviors: Controls the suppression of overlapping behaviors via linear programming. Disabling this feature may increase the algorithm’s adaptability and accuracy.
4. Streaming (Kafka)
The pivotal parameter for streaming in Kafka is kafkaMicroBatchIntervalSec, which defines the duration of each micro batch. Once a micro batch’s raw data is received, it is pre-processed into behaviors, which are then merged with those already in the state. The live behaviors from the current state are then used for clustering, leading to MTMC object creation.
Considerations:
- Micro-batch intervals: Shorter intervals guarantee faster and more regular UI updates, but can also lead to fragmented global IDs.
- Processing times: If intervals are too brief, such that processing durations surpass these intervals, outputs might lag.
- Computation cost: Clustering happens for every micro batch. Hence, multiple smaller batches can be more resource-intensive than a longer one.
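The second consideration can be quantified with a back-of-the-envelope estimate (purely illustrative; the function and its assumptions of constant processing time are not part of the microservice):

```python
# Hypothetical lag estimate for the micro-batch trade-off: if clustering a
# batch takes longer than kafkaMicroBatchIntervalSec, output falls
# progressively behind with every batch.
def batch_lag_after(num_batches, interval_sec, processing_sec):
    """Accumulated output lag (seconds) after num_batches micro batches,
    assuming a constant per-batch processing time."""
    per_batch_lag = max(0.0, processing_sec - interval_sec)
    return num_batches * per_batch_lag
```

For example, a 75-second clustering time against a 60-second interval accrues 15 seconds of lag per batch, so outputs trail by 2.5 minutes after ten batches; keeping processing under the interval keeps lag at zero.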
In the RTLS microservice, the trajectories of each MTMC plus object can be smoothed by adjusting two configuration parameters: mtmcPlusLocationWindowSec and mtmcPlusSmoothingWindowSec. The mtmcPlusLocationWindowSec parameter (default: 1.0 second) aggregates individual locations from all sensors to calculate the current global location, while the mtmcPlusSmoothingWindowSec parameter (default: 1.0 second) aggregates these global locations over a temporal window to calculate an average, smoothing the trajectories. Increasing these parameters will introduce a delay in the actual locations received at the Kafka consumer. The delay introduced is (mtmcPlusLocationWindowSec + mtmcPlusSmoothingWindowSec) / 2 seconds, resulting in a default delay of 1 second. These parameters should be minimized to reduce the delay introduced for RTLS display. To address specific issues in difficult scenarios, increase mtmcPlusLocationWindowSec to reduce “ghosting dots” (flashing locations) and increase mtmcPlusSmoothingWindowSec to reduce jittering trajectories.
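The delay formula above, together with a stand-in for the windowed smoothing, can be sketched as follows. The trailing moving average is an assumption for illustration; the microservice's exact smoothing filter is not specified here:

```python
# Sketch of the documented delay formula plus a simple trailing moving
# average standing in for mtmcPlusSmoothingWindowSec-style smoothing.
def expected_delay_sec(location_window_sec, smoothing_window_sec):
    """Delay introduced at the Kafka consumer by the two windows."""
    return (location_window_sec + smoothing_window_sec) / 2.0

def smooth_trajectory(points, window):
    """Average each (x, y) location over a trailing window of samples."""
    out = []
    for i in range(len(points)):
        chunk = points[max(0, i - window + 1): i + 1]
        xs = sum(p[0] for p in chunk) / len(chunk)
        ys = sum(p[1] for p in chunk) / len(chunk)
        out.append((xs, ys))
    return out
```

With the defaults of 1.0 second each, expected_delay_sec returns the documented 1-second delay; widening either window smooths more at the cost of a later displayed position.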
It is also recommended to use a larger mtmcPlusNumProcessesMax depending on the available CPU cores.
Note
Out of the above 4 MTMC config categories, the following 3 categories’ configs can be dynamically updated during runtime: preprocessing, localization, and clustering. These configs can be updated by using the /config/update/:docType analytics API endpoint (docType will have the value mdx-mtmc-analytics). For more details, check the open-api spec.
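A hedged example of calling this endpoint from Python is shown below. The host and port are placeholders, and the exact request body shape is an assumption; consult the open-api spec for the authoritative contract:

```python
# Hedged sketch of a runtime config update via the analytics API.
# Base URL is a placeholder; the JSON-body-of-updated-categories shape
# is an assumption, not the verified contract.
import json
import urllib.request

def build_update_request(base_url, updated_config):
    """Build the URL and JSON body for /config/update/mdx-mtmc-analytics."""
    url = f"{base_url}/config/update/mdx-mtmc-analytics"
    body = json.dumps(updated_config).encode("utf-8")
    return url, body

def send_update(base_url, updated_config):
    """POST the update; include only the dynamically updatable categories
    (preprocessing, localization, clustering)."""
    url, body = build_update_request(base_url, updated_config)
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:  # network call to placeholder host
        return resp.status
```

For example, send_update("http://mdx-analytics:8080", {"clustering": {"hdbscanMinClusterSize": 6}}) would tighten the minimum cluster size without restarting the microservice, assuming that hostname resolves to the deployed analytics service.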
App Config Details
Name | Category | Type | Default | Range | Description
---|---|---|---|---|---
enableDebug | io | bool | False | True or False | If true, save intermediate results, i.e., frames, behaviors, and MTMC objects in JSON format that can be used for visualization, during MTMC tracking. In RTLS, this flag can add
inMtmcPlusBatchMode | io | bool | False | True or False | If true, use the maximum timestamp in each batch as the current timestamp, only for RTLS batch processing. Set this parameter to false if the current timestamp is available and accurate.
batchId | io | str | “1” | | The pre-defined batch ID for MTMC batch processing.
selectedSensorIds | io | list | [] | | The selected sensor IDs to be processed. If empty, all the sensors are processed.
outputDirPath | io | str | | | The output directory for saving files.
videoDirPath | io | str | | | The directory of input videos.
jsonDataPath | io | str | | | The input raw data file in JSON format.
protobufDataPath | io | str | | | The input raw data file in protobuf format.
groundTruthPath | io | str | | | The input ground truth file in the format of MOTChallenge. If not found, the evaluation is not conducted.
groundTruthFrameIdOffset | io | int | 1 | | The offset of frame IDs in the ground truth in comparison with the raw data.
useFullBodyGroundTruth | io | bool | False | True or False | If true, use full-body bounding boxes recovered from estimated foot points for evaluation.
use3dEvaluation | io | bool | False | True or False | If true, use projected foot points on the ground plane in 3D for evaluation.
plotEvaluationGraphs | io | bool | False | True or False | If true, plot the evaluation graphs in the output directory.
filterByRegionsOfInterest | preprocessing | bool | False | True or False | If true, filter the behaviors and corresponding embeddings and locations by the regions of interest in calibration.
timestampThreshMin | preprocessing | Optional[float] | None | >= 0 or None | The timestamp threshold in minutes, used to filter away old frames in raw data. It is disabled when the value is None.
locationBboxBottomGapThresh | preprocessing | float | 0.02 | >= 0 and <= 1 | The threshold for filtering locations based on the gap between the bounding box’s bottom and the bottom of the frame image. It is a ratio against the frame height. Locations whose corresponding bounding boxes’ bottom gaps are smaller than this threshold are filtered away.
locationConfidenceThresh | preprocessing | float | 0.5 | >= 0 and <= 1 | The detection confidence threshold for filtering locations. Locations whose corresponding detection confidences are smaller than this threshold are filtered away.
locationBboxAreaThresh | preprocessing | float | 0.0008 | >= 0 and <= 1 | The bounding box area threshold for filtering locations. It is a ratio against the frame area. Locations whose corresponding bounding boxes’ areas are smaller than this threshold are filtered away.
locationBboxAspectRatioThresh | preprocessing | float | 0.6 | >= 0 | The bounding box aspect ratio threshold for filtering locations. Locations whose corresponding bounding boxes’ aspect ratios are larger than this threshold are filtered away.
embeddingBboxBottomGapThresh | preprocessing | float | 0.02 | >= 0 and <= 1 | The threshold for filtering embeddings based on the gap between the bounding box’s bottom and the bottom of the frame image. It is a ratio against the frame height. Embeddings whose corresponding bounding boxes’ bottom gaps are smaller than this threshold are filtered away.
embeddingConfidenceThresh | preprocessing | float | 0.5 | >= 0 and <= 1 | The detection confidence threshold for filtering embeddings. Embeddings whose corresponding detection confidences are smaller than this threshold are filtered away.
embeddingBboxAreaThresh | preprocessing | float | 0.0008 | >= 0 and <= 1 | The bounding box area threshold for filtering embeddings. It is a ratio against the frame area. Embeddings whose corresponding bounding boxes’ areas are smaller than this threshold are filtered away.
embeddingBboxAspectRatioThresh | preprocessing | float | 0.6 | >= 0 | The bounding box aspect ratio threshold for filtering embeddings. Embeddings whose corresponding bounding boxes’ aspect ratios are larger than this threshold are filtered away.
embeddingVisibilityThresh | preprocessing | float | 0.5 | >= 0 and <= 1 | The bounding box visibility threshold for filtering embeddings. Embeddings whose corresponding bounding boxes’ visibilities are smaller than this threshold are filtered away.
behaviorConfidenceThresh | preprocessing | float | 0.45 | >= 0 and <= 1 | The detection confidence threshold for filtering behaviors. Behaviors whose mean detection confidences are smaller than this threshold are filtered away.
behaviorBboxAreaThresh | preprocessing | float | 0.0007 | >= 0 and <= 1 | The bounding box area threshold for filtering behaviors. It is a ratio against the frame area. Behaviors whose mean bounding box areas are smaller than this threshold are filtered away.
behaviorBboxAspectRatioThresh | preprocessing | float | 0.75 | >= 0 | The bounding box aspect ratio threshold for filtering behaviors. Behaviors whose mean bounding box aspect ratios are larger than this threshold are filtered away.
behaviorLengthThreshSec | preprocessing | float | 0.0 | >= 0 | The behavior length threshold in seconds for filtering behaviors. Behaviors whose corresponding lengths are smaller than this threshold are filtered away. The
shortBehaviorFinishThreshSec | preprocessing | Optional[float] | None | >= 0 or None | The threshold in minutes for filtering away short behaviors (under
behaviorNumLocationsMax | preprocessing | int | 9000 | >= 0 | The maximum number of locations for a behavior. If the number of locations is above this threshold, the locations are sampled.
behaviorSplitThreshSec | preprocessing | int | 6 | >= 0 | The threshold in seconds to split a behavior if the gap between timestamps is above this value.
behaviorRetentionInStateSec | preprocessing | float | 600.0 | >= 0 | The retention time limit in seconds for the behavior records in state, ignored in MTMC batch processing.
mtmcPlusRetentionInStateSec | preprocessing | float | 10.0 | >= 0 | The retention time limit in seconds for the MTMC plus records in state, ignored in the MTMC microservice.
mtmcPlusInitBufferLenSec | preprocessing | float | 10.0 | >= 0 | The length of the buffer in seconds for initializing MTMC plus state in the RTLS microservice.
mtmcPlusReinitRatioAssignedBehaviors | preprocessing | float | 0.75 | >= 0 and <= 1 | The minimum ratio of assigned behaviors to trigger re-initialization of MTMC plus state, ignored in the MTMC microservice.
mtmcPlusReinitDiffRatioClusters | preprocessing | Optional[float] | None | >= 0 and <= 1 or None | The maximum ratio of difference in the number of clusters compared to
rectifyBboxByCalibration | localization | bool | False | True or False | The flag to enable the calibration-based rectification of bounding boxes by estimating people’s height.
peopleHeightMaxLengthSec | localization | int | 600 | > 0 | The max time duration in seconds for collecting data to estimate people’s height at start. The estimation of people’s height is conducted when either the condition of
peopleHeightNumSamplesMax | localization | int | 1000 | > 0 | The max number of bounding boxes for collecting data to estimate people’s height at start. The estimation of people’s height is conducted when either the condition of
peopleHeightNumBatchFrames | localization | int | 10000 | > 0 | The max number of frames for collecting data to estimate people’s height at start, ignored in stream processing.
peopleHeightEstimationRatio | localization | float | 0.7 | > 0 and <= 1 | The portion of collected data to be used for people height estimation. The smaller people’s heights are likely to be from occluded instances, and thus this parameter is used to filter them away.
peopleHeightVisibilityThresh | localization | float | 0.8 | >= 0 and <= 1 | The bounding box visibility threshold for rectifying bounding boxes. Bounding boxes whose corresponding visibilities are smaller than this threshold are rectified.
overwrittenPeopleHeightMeter | localization | Optional[float] | 1.8 | > 0 or None | The people’s height in meters for overwriting the estimated people’s height.
clusteringAlgo | clustering | str | “HDBSCAN” | [“HDBSCAN”, “AgglomerativeClustering”] | The choice of clustering algorithm, which can be chosen from “HDBSCAN” and “AgglomerativeClustering”.
overwrittenNumClusters | clustering | Optional[int] | None | > 0 or None | The number of clusters for overwriting the clustering results of agglomerative clustering, used when
agglomerativeClusteringDistThresh | clustering | float | 3.5 | > 0 | The distance threshold for agglomerative clustering, used when
hdbscanMinClusterSize | clustering | int | 5 | >= 2 | The minimum size of clusters for HDBSCAN, i.e., the minimum number of behaviors that should be included in each cluster, used when
numReassignmentIterations | clustering | int | 4 | >= 0 | The number of iterations for re-assignment of co-existing behaviors based on the Hungarian algorithm. More iterations usually result in better accuracy, but require more computation time.
reassignmentDistLooseThresh | clustering | float | 1.0 | >= 0 and <= 1 | The distance threshold (combination of appearance distance and spatio-temporal distance) for re-assigning behaviors to clusters during Hungarian matching. An assignment can be made only when a distance is smaller than this threshold. This threshold is used in both MTMC and RTLS modes.
reassignmentDistTightThresh | clustering | float | 0.12 | >= 0 and <= 1 | When a distance (combination of appearance distance and spatio-temporal distance) during re-assignment is smaller than this threshold, the behavior is forced to be assigned to the corresponding cluster, which can be used to correct ID switches in single-camera matching. This threshold is used in the RTLS microservice only.
spatioTemporalDistLambda | clustering | float | 0.1 | >= 0 and <= 1 | The lambda of (normalized) spatio-temporal distance to integrate with appearance-based distance for the re-assignment of co-existing behaviors. A larger value indicates that the spatio-temporal distance is given more weight.
spatioTemporalDirMagnitudeThresh | clustering | float | 0.5 | >= 0 | The spatio-temporal distance is enhanced by a direction influence. The direction influence is applied when the magnitude of the direction vector is larger than this threshold. The unit is meters, or the corresponding unit used in calibration.
spatioTemporalDistType | clustering | str | “Hausdorff” | [“Hausdorff”, “pairwise”] | The type of spatio-temporal distance, which can be chosen from “Hausdorff” and “pairwise”.
enableOnlineSpatioTemporalConstraint | clustering | bool | False | True or False | The flag to enable the spatio-temporal constraint to yield continuous and smooth locations, used in the RTLS microservice.
onlineSpatioTemporalDistThresh | clustering | Optional[float] | None | > 0 or None | The hard spatio-temporal distance threshold for limiting the assignment of behaviors when
suppressOverlappingBehaviors | clustering | bool | False | True or False | The flag to enable suppression of overlapping behaviors based on linear programming. Although overlapping behaviors are due to clustering failures, disabling this feature usually gives more flexibility to the algorithm and yields higher accuracy.
meanEmbeddingsUpdateRate | clustering | float | 0.1 | >= 0 and <= 1 | The ratio of mean embeddings for each MTMC plus object in the state to be updated upon matching with new behaviors, used in the RTLS microservice. A higher value makes the appearance more adaptive to changes in the scene.
skipAssignedBehaviors | clustering | bool | True | True or False | The flag to enable skipping assigned behaviors in the current batch to support real-time processing in the RTLS microservice.
enableOnlineDynamicUpdate | clustering | bool | True | True or False | The flag to enable dynamic update of MTMC plus objects in the state during online tracking in the RTLS microservice. This is usually used to handle entering and exiting objects in the scene.
dynamicUpdateAppearanceDistThresh | clustering | float | 0.2 | >= 0 and <= 1 | The appearance distance threshold for merging temporary MTMC plus objects in the state when
dynamicUpdateSpatioTemporalDistThresh | clustering | float | 10.0 | > 0 | The spatio-temporal distance threshold for merging temporary MTMC plus objects in the state when
dynamicUpdateLengthThreshSec | clustering | float | 9.0 | > 0 | The length threshold in seconds for converting temporary MTMC plus objects in the state to permanent ones when
kafkaBootstrapServers | streaming | str | “localhost:9092” | | A comma-separated list of host-port pairs that are the addresses of the Kafka brokers in a “bootstrap” Kafka cluster that a Kafka client connects to initially to bootstrap itself, ignored in MTMC batch processing.
kafkaProducerLingerMs | streaming | int | 0 | >= 0 | The time in milliseconds to wait before sending messages out to Kafka, ignored in MTMC batch processing.
kafkaMicroBatchIntervalSec | streaming | float | 60.0 | >= 0 | The time interval in seconds for each micro batch, ignored in MTMC batch processing. The filter of time duration in the web UI needs to be larger than this value to have events displayed.
kafkaRawConsumerPollTimeoutMs | streaming | int | 10000 | >= 0 | The timeout in milliseconds to poll
kafkaNotificationConsumerPollTimeoutMs | streaming | int | 100 | >= 0 | The timeout in milliseconds to poll
kafkaConsumerMaxRecordsPerPoll | streaming | int | 100000 | >= 0 | The maximum records per poll, ignored in MTMC batch processing.
sendEmptyMtmcPlusMessages | streaming | bool | True | True or False | The flag to allow empty
mtmcPlusFrameBatchSizeMs | streaming | int | 180 | >= 0 | The frame batch size in milliseconds, used in the RTLS microservice.
mtmcPlusBehaviorBatchesConsumed | streaming | int | 4 | >= 1 | The number of behavior batches consumed in the RTLS microservice.
mtmcPlusFrameBufferResetSec | streaming | float | 4.0 | >= 0 | The time in seconds for resetting the frame buffer, used in the RTLS microservice.
mtmcPlusTimestampDelayMs | streaming | int | 100 | >= 0 | The time in milliseconds to delay the timestamps for synchronizing behaviors from multiple processes, used in the RTLS microservice.
mtmcPlusLocationWindowSec | streaming | float | 1.0 | >= 0 | The time window in seconds to aggregate the matched behaviors’ locations and compute the location of each MTMC plus object, used in the RTLS microservice.
mtmcPlusSmoothingWindowSec | streaming | float | 1.0 | >= 0 | The time window in seconds to smoothen the locations of each MTMC plus object, used in the RTLS microservice.
mtmcPlusNumProcessesMax | streaming | int | 8 | > 0 | The max number of processes to run behavior pre-processing in the RTLS microservice. This config’s value is determined by the number of cores available in the system and the number of partitions assigned to Kafka topic
Viz MTMC Config
Viz Config in JSON
{
  "setup": {
    "vizMode": "mtmc_objects",
    "vizMtmcObjectsMode": "grid",
    "enableMultiprocessing": false,
    "ffmpegRequired": false
  },
  "io": {
    "selectedSensorIds": [],
    "selectedBehaviorIds": [],
    "selectedGlobalIds": [],
    "outputDirPath": "results",
    "videoDirPath": "metropolis-apps-data/videos/mtmc-app",
    "mapPath": "images/building=Nvidia-Bldg-K-Map.png",
    "framesPath": "results/frames.json",
    "behaviorsPath": "results/behaviors.json",
    "mtmcObjectsPath": "results/mtmc_objects.json",
    "groundTruthPath": ""
  },
  "plotting": {
    "gridLayout": [2, 2],
    "blankOutEmptyFrames": false,
    "vizFilteredFrames": true,
    "outputFrameHeight": 1080,
    "tailLengthMax": 200,
    "smoothingTailLengthThresh": 5,
    "smoothingTailWindow": 30
  }
}
Categorization of Config Parameters
MTMC visualization configuration parameters are categorized as:
Setup Parameters: vizMode, vizMtmcObjectsMode, enableMultiprocessing, and ffmpegRequired
Input/Output Parameters: selectedSensorIds, selectedBehaviorIds, selectedGlobalIds, outputDirPath, videoDirPath, mapPath, framesPath, behaviorsPath, mtmcObjectsPath, and groundTruthPath
Plotting Parameters: gridLayout, blankOutEmptyFrames, vizFilteredFrames, outputFrameHeight, tailLengthMax, smoothingTailLengthThresh, and smoothingTailWindow
Viz Config Details
Name | Category | Type | Default | Range | Description
---|---|---|---|---|---
vizMode | setup | str | "mtmc_objects" | ["frames", "behaviors", "mtmc_objects", "ground_truth_bboxes", "ground_truth_locations"] | The choice of visualization mode.
vizMtmcObjectsMode | setup | str | "grid" | ["grid", "sequence", "topview"] | The visualization mode for MTMC objects, used when vizMode is "mtmc_objects".
enableMultiprocessing | setup | bool | False | True or False | The flag to enable multiprocessing when plotting the output. The system may get stuck when the number of parallel processes is too large.
ffmpegRequired | setup | bool | False | True or False | The flag to enable conversion of the output videos from MPEG-4 to H.264 format.
selectedSensorIds | io | list | [] | | The selected sensor IDs to be plotted. If empty, all sensors are plotted.
selectedBehaviorIds | io | list | [] | | The selected behavior IDs to be plotted. If empty, all behaviors are plotted.
selectedGlobalIds | io | list | [] | | The selected global IDs to be plotted. If empty, all MTMC objects are plotted.
outputDirPath | io | str | | | The directory for saving output videos.
videoDirPath | io | str | | | The directory of input videos (ignored in visualization modes that do not use the input videos).
mapPath | io | str | | | The path to the input map image for top-view visualization of MTMC objects, used when vizMtmcObjectsMode is "topview".
framesPath | io | str | | | The path to the input frames' data in JSON format.
behaviorsPath | io | str | | | The path to the input behaviors' data in JSON format.
mtmcObjectsPath | io | str | | | The path to the input MTMC objects' data in JSON format.
groundTruthPath | io | str | | | The path to the ground truth in MOTChallenge format, used when vizMode is "ground_truth_bboxes" or "ground_truth_locations".
gridLayout | plotting | list | [2, 2] | 2 positive integers | The grid layout for visualizing MTMC objects in the grid mode.
blankOutEmptyFrames | plotting | bool | False | True or False | If true, blank out frame images in which no object is present (applies in some visualization modes).
vizFilteredFrames | plotting | bool | False | True or False | If true, visualize frames that have been processed by filtering.
outputFrameHeight | plotting | int | -1 | > 0 or -1 | The frame height for the output videos. If -1, the original frame height is used.
tailLengthMax | plotting | int | 200 | >= 0 | The maximum length (in frames) for plotting tails, i.e., past trajectories.
smoothingTailLengthThresh | plotting | int | 5 | >= 0 | The tail-length threshold (in frames) for applying smoothing. Tails shorter than this length are not smoothed.
smoothingTailWindow | plotting | int | 30 | >= 0 | The window (in frames) for smoothing the tails.
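When editing this config by hand, the documented ranges can be encoded in a small validator. The sketch below is illustrative and not part of the microservice; the top-level section names ("setup", "plotting") and the specific checks are assumptions derived from the categorization and ranges documented above.

```python
import json

# Documented range of vizMode (see the table above).
VALID_VIZ_MODES = {"frames", "behaviors", "mtmc_objects",
                   "ground_truth_bboxes", "ground_truth_locations"}

def validate_viz_config(config: dict) -> list:
    """Return a list of problems found in a viz config; empty means it passed.

    The section names ("setup", "plotting") and the checks below are
    assumptions for illustration, based on the documented ranges.
    """
    problems = []
    setup = config.get("setup", {})
    plotting = config.get("plotting", {})
    if setup.get("vizMode", "mtmc_objects") not in VALID_VIZ_MODES:
        problems.append("unknown vizMode: %r" % setup.get("vizMode"))
    grid = plotting.get("gridLayout", [2, 2])
    if len(grid) != 2 or any(not isinstance(n, int) or n <= 0 for n in grid):
        problems.append("gridLayout must be two positive integers")
    height = plotting.get("outputFrameHeight", -1)
    if height != -1 and height <= 0:
        problems.append("outputFrameHeight must be > 0 or -1")
    return problems

# Example: validate a config loaded from disk.
# problems = validate_viz_config(json.load(open("viz_config.json")))
```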
Viz RTLS Config
Viz Config in JSON
{
"input": {
"calibrationPath": "path/to/calibration.json",
"mapPath": "path/to/map.png",
"rtlsLogPath": "path/to/mdx-rtls.log",
"videoDirPath": "path/to/folder/containing/videos",
"rawDataPath": "path/to/raw_data.log"
},
"output": {
"outputVideoPath": "path/to/output_video.mp4",
"outputMapHeight": 1080,
"displaySensorViews": false,
"sensorViewsLayout": "radial",
"sensorViewDisplayMode": "rotational",
"sensorFovDisplayMode": "rotational",
"skippedBeginningTimeSec": 0.0,
"outputVideoDurationSec": 60.0,
"sensorSetup": 8,
"bufferLengthThreshSec": 3.0,
"trajectoryLengthThreshSec": 5.0,
"sensorViewStartTimeSec": 2.0,
"sensorViewDurationSec": 1.0,
"sensorViewGapSec": 0.1
}
}
Categorization of Config Parameters
RTLS visualization configuration parameters are categorized as:
Input Parameters: calibrationPath, mapPath, rtlsLogPath, videoDirPath, and rawDataPath
Output Parameters: outputVideoPath, outputMapHeight, displaySensorViews, sensorViewsLayout, sensorViewDisplayMode, sensorFovDisplayMode, skippedBeginningTimeSec, outputVideoDurationSec, sensorSetup, bufferLengthThreshSec, trajectoryLengthThreshSec, sensorViewStartTimeSec, sensorViewDurationSec, and sensorViewGapSec
Viz Config Details
Name | Category | Type | Default | Range | Description
---|---|---|---|---|---
calibrationPath | input | str | | | The path to the calibration file in JSON format.
mapPath | input | str | | | The path to the input map image for top-view visualization.
rtlsLogPath | input | str | | | The path to the RTLS log from the mdx-rtls Kafka topic.
videoDirPath | input | str | | | The path to the directory of video files.
rawDataPath | input | str | | | The path to the raw data file (protobuf format by default).
outputVideoPath | output | str | | | The path to the output video file.
outputMapHeight | output | int | 1080 | > 0 | The height in pixels for scaling the map image in the output video.
displaySensorViews | output | bool | False | True or False | If true, display the sensor views around the top-view visualization.
sensorViewsLayout | output | str | "radial" | ["radial", "split"] | The layout of the sensor views, used when displaySensorViews is true.
sensorViewDisplayMode | output | str | "rotational" | ["rotational", "cumulative"] | The display mode for the sensor views, used when displaySensorViews is true.
sensorFovDisplayMode | output | str | "rotational" | ["rotational", "cumulative"] | The display mode for the sensor FOVs, used when displaySensorViews is true.
skippedBeginningTimeSec | output | float | 0.0 | >= 0 | The time in seconds to skip at the beginning of the output video.
outputVideoDurationSec | output | float | 60.0 | > 0 | The duration of the output video in seconds.
sensorSetup | output | int | 30 | [8, 12, 16, 30, 40, 96, 100] | The pre-defined setup according to the number of sensors, used when displaySensorViews is true.
bufferLengthThreshSec | output | float | 3.0 | > 0 | The buffer length in seconds for smoothing the locations for visualization.
trajectoryLengthThreshSec | output | float | 5.0 | > 0 | The trajectory length limit in seconds for plotting the tails of locations.
sensorViewStartTimeSec | output | float | 2.0 | > 0 | The start time in seconds for displaying the sensor views in rotation, used when sensorViewDisplayMode is "rotational".
sensorViewDurationSec | output | float | 1.0 | > 0 | The duration in seconds for displaying each sensor view in rotation, used when sensorViewDisplayMode is "rotational".
sensorViewGapSec | output | float | 0.1 | > 0 | The gap in seconds between sensor views in rotation, used when sensorViewDisplayMode is "rotational".
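To make the interaction of the rotational timing parameters concrete, the sketch below computes which sensor view is on screen at a given time, assuming a simple round-robin schedule (a lead-in of sensorViewStartTimeSec, then sensorViewDurationSec of display plus sensorViewGapSec of gap per view). The scheduling logic is an illustrative assumption, not the microservice's actual implementation.

```python
def active_sensor_view(t_sec, num_sensors, start_sec=2.0,
                       duration_sec=1.0, gap_sec=0.1):
    """Return the sensor index displayed at time t_sec, or None if no
    sensor view is shown (before the start time or during a gap).

    Assumes a round-robin schedule: after start_sec, each sensor view is
    shown for duration_sec, followed by a gap of gap_sec.
    """
    if t_sec < start_sec:
        return None                          # before the rotation starts
    period = duration_sec + gap_sec          # one display slot plus its gap
    slot, offset = divmod(t_sec - start_sec, period)
    if offset >= duration_sec:
        return None                          # inside the gap between views
    return int(slot) % num_sensors           # rotate through the sensors
```

For example, with the defaults above and 8 sensors, sensor 0 is shown from 2.0 s to 3.0 s, nothing during the 0.1 s gap, then sensor 1 from 3.1 s, and so on.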
Calibration
Use the Calibration tool to generate the calibration JSON. For more details, refer to the Camera Calibration section. The calibration JSON structure is shown below:
{
"version": "1.0",
"osmURL": "",
"calibrationType": "cartesian",
"sensors": [
{
"type": "camera",
"id": "Retail_Synthetic_Cam01",
"origin": {
"lng": 0,
"lat": 0
},
"geoLocation": {
"lng": 0,
"lat": 0
},
"coordinates": {
"x": 27.752674116114072,
"y": 29.520047192178833
},
"scaleFactor": 29.3610391053046,
"attributes": [
{
"name": "fps",
"value": "30"
},
{
"name": "depth",
"value": ""
},
{
"name": "fieldOfView",
"value": ""
},
{
"name": "direction",
"value": "80.83911991119385"
},
{
"name": "source",
"value": "vst"
},
{
"name": "frameWidth",
"value": "1920"
},
{
"name": "frameHeight",
"value": "1080"
},
{
"name": "fieldOfViewPolygon",
"value": "POLYGON((37.30339706547104 23.53302611402356, 32.39147622088676 27.72174026541681, 23.888735595644498 26.37998053209448, 18.37785774763376 17.302857646758667, 15.06215357071955 1.9512522630593612, 38.851128391907295 1.7875684444191782, 37.30339706547104 23.53302611402356))"
}
],
"place": [
{
"name": "building",
"value": "Retail-Store"
}
],
"imageCoordinates": [
{
"x": 141.6635912555878,
"y": 269.48316499503875
},
{
"x": 240.98584878571785,
"y": 265.9477324692906
},
{
"x": 474.86495021238807,
"y": 256.8443141943412
},
{
"x": 560.3212274031938,
"y": 253.34065913954282
},
{
"x": 726.9749882013346,
"y": 246.47581521243487
},
{
"x": 806.7600319715643,
"y": 243.18033763144211
},
{
"x": 1049.1798543328828,
"y": 210.84647658915785
},
{
"x": 599.1893869353772,
"y": 158.40456743892264
},
{
"x": 826.3199319061305,
"y": 430.18982743871027
},
{
"x": 692.2412605037435,
"y": 439.5586563490481
},
{
"x": 293.8022700703916,
"y": 940.2497293827583
},
{
"x": 1402.657869447458,
"y": 809.4458238430801
},
{
"x": 1110.7821636347703,
"y": 413.2539924280829
},
{
"x": 929.7185023273412,
"y": 246.31588848277949
},
{
"x": 1014.7021052459154,
"y": 286.1249896426217
},
{
"x": 590.1431423343173,
"y": 334.4703629959116
},
{
"x": 415.03589063341894,
"y": 204.6531095428365
},
{
"x": 902.927686501944,
"y": 159.48107191410975
},
{
"x": 532.7209218694804,
"y": 828.0865212903391
},
{
"x": 1115.481527217435,
"y": 318.46536548174294
},
{
"x": 1475.2066751140385,
"y": 275.30376809549483
}
],
"globalCoordinates": [
{
"x": 16.979024629976095,
"y": 18.983219038249672
},
{
"x": 18.482709318763998,
"y": 18.968406614904072
},
{
"x": 22.166473566193257,
"y": 18.95933943696289
},
{
"x": 23.56924098641491,
"y": 18.95869975313936
},
{
"x": 26.396910248181758,
"y": 18.981927421337907
},
{
"x": 27.799359187521056,
"y": 18.99127020773497
},
{
"x": 32.715754693039486,
"y": 20.88230250608199
},
{
"x": 22.729803639723098,
"y": 28.773798981032822
},
{
"x": 27.812523744506358,
"y": 10.739385291736074
},
{
"x": 26.40710095590235,
"y": 10.745029820964021
},
{
"x": 24.85870713145265,
"y": 4.792837852944807
},
{
"x": 30.96384273807492,
"y": 4.728927314341821
},
{
"x": 30.964239137891095,
"y": 10.71346720864782
},
{
"x": 29.94779515259621,
"y": 18.42203428733013
},
{
"x": 30.927376677045483,
"y": 15.754878369497227
},
{
"x": 24.785512670974654,
"y": 14.430034306444753
},
{
"x": 19.81976862078707,
"y": 23.58767170645169
},
{
"x": 30.437280874103934,
"y": 27.3475721093038
},
{
"x": 25.900291031269614,
"y": 5.398659264709973
},
{
"x": 32.00392031859042,
"y": 13.957021804422336
},
{
"x": 38.26375971510251,
"y": 15.039096669496686
}
],
"tripwires": [],
"rois": []
}
]
}
A calibration comprises an array of sensors, where each sensor record consists of multiple attributes. The ones used by the pipeline are:
type : type of the sensor. For example, camera.
id : unique ID of the sensor.
origin consists of:
origin.lng : the longitude of the origin. Locations often need to be expressed in Cartesian coordinates; a small area such as a city can be considered planar, so all locations within it can be measured in Cartesian coordinates relative to an origin, which can be any fixed location in the area.
origin.lat : the latitude of the origin.
geoLocation : the geo-location of the sensor, consisting of [lng,lat].
coordinates : the location of the sensor in the Cartesian coordinates, consisting of [x,y].
translationToGlobalCoordinates : the translation vector to convert the locations to global coordinates for plotting on the map, consisting of [x,y].
scaleFactor : the scale factor of global coordinates from the unit of interest, e.g., meter, to pixel unit on the map image.
attributes : an array of name-value pairs consisting of:
fps : the video frame rate.
depth : the depth of the sensor.
fieldOfView : the field of view (FOV) of the sensor.
direction : the direction of the sensor.
source : the video source, e.g., VST.
frameWidth : the frame width of the input video.
frameHeight : the frame height of the input video.
fieldOfViewPolygon : the FOV polygon in WKT format.
place : an array of name-value pairs to represent a place, e.g., city=santa-clara/building=bldg_K/room=G.
imageCoordinates : the image coordinate locations to be mapped to globalCoordinates by the calibration tool. The mapping is used to generate the homography matrix.
globalCoordinates : see imageCoordinates mentioned above.
intrinsicMatrix : the 3-by-3 matrix of intrinsic camera parameters, such as the focal lengths, focal center, and scale factor.
extrinsicMatrix : the 3-by-4 matrix of extrinsic camera parameters (rotation and translation) for converting 3D world coordinates to camera coordinates.
cameraMatrix : the 3-by-4 camera matrix to convert the coordinates in the 3D world to the pixel locations in the sensor view.
homography : the 3-by-3 homography matrix to convert the coordinates on the 3D ground plane to the pixel locations in the sensor view.
tripwires : the list of tripwires formed by arrays of points, usually drawn at doorways to count the number of people entering or exiting.
rois : list of regions of interest formed by arrays of points.
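As an illustration of how the homography above is applied, the following sketch maps a ground-plane point (x, y) to pixel coordinates via p = H · [x, y, 1]^T, followed by division by the homogeneous coordinate w. This is plain Python with a nested-list matrix for self-containment; a real pipeline would typically use NumPy or OpenCV.

```python
def project_ground_to_pixel(homography, x, y):
    """Map a ground-plane point (x, y) to pixel coordinates (u, v) using a
    3-by-3 homography given as a nested list: compute H @ [x, y, 1], then
    divide by the homogeneous coordinate w."""
    h = homography
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    if w == 0:
        raise ValueError("point maps to infinity (w == 0)")
    return u / w, v / w
```

Mapping pixel locations back to the ground plane uses the inverse of the same homography.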