Troubleshooting¶
If you run into trouble while using DeepStream, consider the following solutions. If you don’t find answers below, post your questions on the DeepStream developer forum.
You are migrating from DeepStream 5.x to DeepStream 6.0¶
Solution:
You must clean up the DeepStream 5.x libraries and binaries. Use one of these methods to clean up:
* For dGPU: To remove DeepStream 5.x (5.1 for example):
1. Open the uninstall.sh file in /opt/nvidia/deepstream/deepstream/
2. Set PREV_DS_VER as 5.1
3. Run the script as sudo:
$ sudo ./uninstall.sh
* For Jetson: Flash the target device with the latest release of JetPack.
“NvDsBatchMeta not found for input buffer” error while running DeepStream pipeline¶
Solution:
The Gst-nvstreammux plugin is not in the pipeline. Starting with DeepStream 4.0, Gst-nvstreammux is a required plugin. This is an example pipeline:
Gst-nvv4l2decoder --> Gst-nvstreammux --> Gst-nvinfer --> Gst-nvtracker --> Gst-nvmultistreamtiler --> Gst-nvvideoconvert --> Gst-nvdsosd --> Gst-nveglglessink
The DeepStream reference application fails to launch, or any plugin fails to load¶
Solution:
Try clearing the GStreamer cache by running the command:
$ rm -rf ${HOME}/.cache/gstreamer-1.0
Also run this command if there is an issue with loading any of the plugins. Warnings or errors for failing plugins are displayed on the terminal.
$ gst-inspect-1.0
Then run this command to find missing dependencies:
$ ldd <plugin>.so
where <plugin> is the name of the plugin that failed to load.
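To narrow down the ldd output, the unresolved entries can be filtered out. The following is a minimal sketch, assuming the usual ldd output format in which a missing library is reported as "=> not found". The names filter_missing and missing_deps are hypothetical helpers, not part of DeepStream or GStreamer.

```shell
# Hypothetical helper: keep only the library names that ldd could not resolve.
# Reads ldd output on stdin; unresolved entries look like "libfoo.so => not found".
filter_missing() {
    awk '/=> not found/ { print $1 }'
}

# Hypothetical helper: print the unresolved dependencies of one plugin.
# Usage: missing_deps /path/to/<plugin>.so
missing_deps() {
    ldd "$1" | filter_missing
}
```

Calling missing_deps on the plugin that failed to load prints one library name per line, or nothing if every dependency resolves.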
Application fails to run when the neural network is changed¶
Solution:
Make sure that the network parameters are updated for the corresponding [GIE] group in the configuration file (e.g. source30_720p_dec_infer-resnet_tiled_display_int8.txt). Also make sure that the Gst-nvinfer plugin’s configuration file is updated accordingly.
When the model is changed, make sure that the application is not using old engine files.
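One way to check for old engine files is to list the serialized engines on disk before restarting the application. This is a minimal sketch, assuming the cached TensorRT engines carry a .engine extension. The name list_engine_files is a hypothetical helper, and the directory to search is whichever one holds your models.

```shell
# Hypothetical helper: list cached TensorRT engine files under a directory.
# Assumption: serialized engines use the ".engine" extension.
# Usage: list_engine_files [directory]   (defaults to the current directory)
list_engine_files() {
    find "${1:-.}" -type f -name '*.engine' | sort
}
```

Review the list and move or delete any engine that was built from the previous model, so that the application does not load it.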
The DeepStream application is running slowly (Jetson only)¶
Solution:
Ensure that Jetson clocks are set high. Run these commands to set Jetson clocks high.
$ sudo nvpmodel -m <mode>   # for MAX perf and power, mode is 0
$ sudo jetson_clocks
The DeepStream application is running slowly¶
Solution 1:
One of the plugins in the pipeline may be running slowly. You can measure the latency of each plugin in the pipeline to determine if one of them is slow.
To enable frame latency measurement, run this command on the console:
$ export NVDS_ENABLE_LATENCY_MEASUREMENT=1
To enable latency measurement for all plugins, run this command on the console:
$ export NVDS_ENABLE_COMPONENT_LATENCY_MEASUREMENT=1
Note
When measuring frame latency using the DeepStream latency APIs, if large frame latency numbers on the order of 10^12 (1e12) are observed, modify the latency measurement code (the call to the nvds_measure_buffer_latency API) to:
...
guint num_sources_in_batch = nvds_measure_buffer_latency(buf, latency_info);
if (num_sources_in_batch > 0 && latency_info[0].latency > 1e6) {
    /* Implausibly large latency: reverse the batch user meta list and measure again. */
    NvDsBatchMeta *batch_meta = gst_buffer_get_nvds_batch_meta (buf);
    batch_meta->batch_user_meta_list = g_list_reverse (batch_meta->batch_user_meta_list);
    num_sources_in_batch = nvds_measure_buffer_latency(buf, latency_info);
}
...
Solution 2: (dGPU only)
Ensure that your GPU card is in the PCI slot with the highest bus width.
Solution 3:
In the configuration file’s [streammux] group, set batched-push-timeout to 1/max_fps.
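As a worked example of the 1/max_fps rule, the sketch below converts a frame rate into a timeout value. It assumes that batched-push-timeout is expressed in microseconds and that max_fps is the highest frame rate among the input streams. The name push_timeout_us is a hypothetical helper.

```shell
# Hypothetical helper: convert a maximum frame rate into 1/max_fps,
# expressed in microseconds (the assumed unit of batched-push-timeout).
# Usage: push_timeout_us <max_fps>
push_timeout_us() {
    local max_fps=$1
    # Integer division: one second (1,000,000 us) divided by the frame rate.
    echo $(( 1000000 / max_fps ))
}

push_timeout_us 30   # prints 33333
push_timeout_us 25   # prints 40000
```

Under these assumptions, a 30 fps source would use batched-push-timeout=33333 in the [streammux] group.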
Solution 4:
In the configuration file’s [streammux] group, set width and height to the stream’s resolution.
Solution 5:
For RTSP streaming input, in the configuration file’s [streammux] group, set live-source=1. Also make sure that all [sink#] groups have the sync property set to 0.
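A configuration file with several sinks can be checked for the sync setting with a short script. This is a minimal sketch, assuming the plain key=value group layout used by the deepstream-app configuration files. The name sinks_with_sync is a hypothetical helper, and it only reports sinks where sync is set explicitly to a non-zero value.

```shell
# Hypothetical helper: print every [sink#] group whose sync key is not 0.
# Usage: sinks_with_sync <config-file>
sinks_with_sync() {
    awk '
        # Remember the most recent group header, e.g. "[sink0]".
        /^\[/ { group = $0 }
        # Inside a sink group, inspect the sync key.
        group ~ /^\[sink[0-9]+\]/ && /^sync[ \t]*=/ {
            split($0, kv, "=")
            gsub(/[ \t]/, "", kv[2])
            if (kv[2] != "0") print group
        }
    ' "$1"
}
```

An empty result means no sink group sets sync to a non-zero value.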
Solution 6:
If secondary inferencing is enabled, try increasing batch-size in the configuration file’s [secondary-gie#] group in case the number of objects to be inferred is greater than the batch-size setting.
Solution 7:
On Jetson, use Gst-nvoverlaysink instead of Gst-nveglglessink, as nveglglessink requires GPU utilization.
Solution 8:
If the GPU is the performance bottleneck, try increasing the interval at which the primary detector infers on input frames. You can do this by modifying the interval property of the [primary-gie] group in the application configuration, or the interval property of the Gst-nvinfer configuration file.
Solution 9:
If the elements in the pipeline are getting starved for buffers (you can check whether CPU/GPU utilization is low), increase the number of buffers allocated by the decoder by setting the num-extra-surfaces property of the [source#] group in the application configuration, or the num-extra-surfaces property of the Gst-nvv4l2decoder element.
Solution 10:
If you are running the application inside docker or on the console and it delivers low FPS, set qos=0 in the configuration file’s [sink0] group.
The issue is caused by the initial load. With qos set to 1, the property’s default value in the [sink0] group, decodebin starts dropping frames.
Solution 11:
For RTSP streaming input, if the input has high jitter, the GStreamer rtpjitterbuffer element might drop packets which are late. Increase the latency property of rtspsrc; for deepstream-app, set latency in the [source*] group. Alternatively, if using an RTSP type source (type=4) with deepstream-app, turn off drop-on-latency in deepstream_source_bin.c. These steps may add cumulative delay in frames reaching the renderer and memory accumulation in the rtpjitterbuffer if the pipeline is not fast enough.
Solution 12:
On Jetson, if GPU usage is not at 100%, set scaling-compute-hw = 1 in the Gst-nvinfer configuration file.
On NVIDIA Jetson Nano™, deepstream-segmentation-test starts as expected, but crashes after a few minutes, rebooting the system¶
Solution:
We recommend you power the Jetson module through the DC power connector when running this app. USB adapters may not be able to handle the transients.
Errors occur when deepstream-app is run with a number of streams greater than 100¶
For example: (deepstream-app:15751): GStreamer-CRITICAL **: 19:25:29.810: gst_poll_write_control: assertion 'set != NULL' failed.
Solution:
Run this command on the console:
$ ulimit -Sn 4096
Then run deepstream-app again.
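The soft limit can only be raised as far as the hard limit allows, so it can help to check both before running ulimit -Sn. The following is a minimal sketch, assuming a shell with the ulimit built-in. The name raise_nofile is a hypothetical helper. It must run in the same shell that later starts deepstream-app, because the limit applies to the current shell and its child processes.

```shell
# Hypothetical helper: raise the soft open-file limit to at least the given value.
# Usage: raise_nofile <wanted-limit>
raise_nofile() {
    local want=$1 soft hard
    soft=$(ulimit -Sn)
    hard=$(ulimit -Hn)
    # Nothing to do if the soft limit is already high enough.
    if [ "$soft" = "unlimited" ] || [ "$soft" -ge "$want" ]; then
        return 0
    fi
    # The soft limit cannot be raised above the hard limit.
    if [ "$hard" != "unlimited" ] && [ "$hard" -lt "$want" ]; then
        echo "hard limit $hard is below $want" >&2
        return 1
    fi
    ulimit -Sn "$want"
}
```

For the case above, raise_nofile 4096 either leaves the shell with a soft limit of at least 4096 or reports that the hard limit is too low.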
Errors occur when deepstream-app fails to load plugin Gst-nvinferserver¶
For example: (deepstream-app:16632): GStreamer-WARNING **: 13:13:31.201: Failed to load plugin '/usr/lib/x86_64-linux-gnu/gstreamer-1.0/deepstream/libnvdsgst_inferserver.so':
libtrtserver.so: cannot open shared object file: No such file or directory.
This is a harmless warning indicating that DeepStream’s nvinferserver plugin cannot be used because Triton Inference Server is not installed on x86 (dGPU) platforms. Jetson platforms should not have this problem since Triton is installed automatically by the DeepStream package.
Solution 1:
Ignore this message if you do not need Triton support. Otherwise, see Solutions 2 and 3.
Solution 2:
Pull the deepstream-triton docker image and start the container. Retry deepstream-app to launch Triton models.
Solution 3:
Build the Triton server library from source (https://github.com/triton-inference-server/server/releases/tag/v2.5.0) and fix the dynamic link problem manually.
TensorFlow models are running into OOM (Out-Of-Memory) problem¶
This problem may manifest as other errors, such as CUDA_ERROR_OUT_OF_MEMORY, a core dump, or the application getting killed, once GPU memory is set up by the TensorFlow component.
Solution:
Tune the tf_gpu_memory_fraction parameter in the config file (e.g. config_infer_primary_detector_ssd_inception_v2_coco_2018_01_28.txt) to a proper value. For more details, see samples/configs/deepstream-app-triton/README.
Memory usage keeps on increasing when the source is a long-duration containerized file (e.g. mp4, mkv)¶
A memory accumulation bug is present in GStreamer’s Base Parse class, which potentially affects all codec parsers provided by GStreamer. This bug is seen only with long-duration seekable streams (mostly containerized files, e.g. mp4). It does not affect live sources like RTSP. An issue has been filed on GStreamer’s GitLab project: https://gitlab.freedesktop.org/gstreamer/gstreamer/-/issues/468
Solution:
Apply the following temporary fix to the GStreamer sources and build the library.
1. Check the exact GStreamer version installed on the system.
$ gst-inspect-1.0 --version
gst-inspect-1.0 version 1.14.5
GStreamer 1.14.5
https://launchpad.net/distros/ubuntu/+source/gstreamer1.0
2. Clone the GStreamer repo and check out the tag corresponding to the installed version.
$ git clone git@gitlab.freedesktop.org:gstreamer/gstreamer.git
$ cd gstreamer
$ git checkout 1.14.5
3. Make sure the build dependencies are installed.
$ sudo apt install libbison-dev build-essential flex debhelper
4. Run autogen.sh and the configure script.
$ ./autogen.sh --noconfigure
$ ./configure --prefix=$(pwd)/out   # Don’t want to overwrite system libs
5. Save the following patch to a file, for example patch.txt.
diff --git a/libs/gst/base/gstbaseparse.c b/libs/gst/base/gstbaseparse.c
index 41adf130e..ffc662a45 100644
--- a/libs/gst/base/gstbaseparse.c
+++ b/libs/gst/base/gstbaseparse.c
@@ -1906,6 +1906,9 @@ gst_base_parse_add_index_entry (GstBaseParse * parse, guint64 offset,
   GST_LOG_OBJECT (parse, "Adding key=%d index entry %" GST_TIME_FORMAT
       " @ offset 0x%08" G_GINT64_MODIFIER "x", key, GST_TIME_ARGS (ts), offset);
 
+  if (!key)
+    goto exit;
+
   if (G_LIKELY (!force)) {
 
     if (!parse->priv->upstream_seekable) {
6. Apply the patch.
$ cat patch.txt | patch -p1
7. Build the sources.
$ make -j$(nproc) && make install
8. Back up the distribution-provided library and copy the newly built library into place. Adjust the library name for your version. For Jetson, replace x86_64-linux-gnu with aarch64-linux-gnu.
$ sudo cp /usr/lib/x86_64-linux-gnu/libgstbase-1.0.so.0.1405.0 ${HOME}/libgstbase-1.0.so.0.1405.0.backup
$ sudo cp out/lib/libgstbase-1.0.so.0.1405.0 /usr/lib/x86_64-linux-gnu/
Stale frames observed on RTSP output¶
If stale frames are observed on the RTSP output, update the rtsp-port and udp-port parameters (of the RTSP sink) in the config file used to run the DeepStream application.
Update rtsp-port to another port number. For example: rtsp-port=8660
Update udp-port to another port number. For example: udp-port=5500
Troubleshooting in NvDCF Parameter Tuning¶
Flickering Bbox¶
If the PGIE detection interval is set to zero (i.e., interval=0 in the ds-app config file), bbox flickering may occur in the video output if the value for minTrackerConfidence is set too low. Try increasing the value of this parameter to mitigate the issue.
If the PGIE detection interval is set to a non-zero value (i.e., interval > 0 in the ds-app config file), the tracker outputs are not reported on the uninferenced frames, although all the targets are being tracked in the background. Bbox flickering in the real-time video display from OSD is therefore expected behavior.
To mitigate this issue, users can first enable the past-frame data configuration to retrieve the missed outputs and then add a custom module to combine the real-time metadata with the past-frame data. By doing so, users can combine the data in a proper order and can opt to visualize the combined data on the display with no flickering bbox issue.
Frequent tracking ID changes although no nearby objects¶
This may occur because the tracker cannot detect the target from the correlation response map. It is recommended to start with a lower minimum qualification for the target. First, set minTrackerConfidence to a relatively low value like 0.5. Also, if the state estimator is enabled, the prediction may not be accurate enough. Users may tune the state estimator parameters based on the expected motion dynamics, or disable the state estimator during debugging.
Frequent tracking ID switches to the nearby objects¶
Make the data association policy stricter by increasing the minimum qualifications such as:
minMatchingScore4SizeSimilarity
minMatchingScore4Iou
minMatchingScore4VisualSimilarity
Note
For more FAQs and troubleshooting, see https://forums.developer.nvidia.com/t/deepstream-sdk-faq/
Error while running ONNX / Explicit batch dimension networks¶
After upgrading TensorRT on Jetson, running with ONNX / Explicit batch dimension networks fails with the error “Network has dynamic or shape inputs, but no optimization profile has been defined.”
Due to an ABI break in TensorRT 7.1.3.0 (part of JetPack 4.4 - JetPack 4.6), libnvds_infer.so needs to be recompiled from the sources provided in the SDK when moving to newer TensorRT versions. The sources, along with the compilation instructions, can be found in /opt/nvidia/deepstream/deepstream-6.0/sources/libs/nvdsinfer
DeepStream plugins failing to load without DISPLAY variable set when launching DS dockers¶
Solution:
The error “No EGL Display; nvbufsurftransform: Could not get EGL display connection” is resolved when either of the requirements below is met. The requirement must be met before starting the docker using the docker run command.
1. [When the user expects to use a display window]¶
Set an appropriate value for the DISPLAY variable and execute the command xhost + from the host terminal, to allow the docker to launch a display window.
Example:
$ export DISPLAY=:0
$ xhost +
2. [When the user does not expect to use a display window]¶
Unset the DISPLAY variable, or launch the docker without exporting the DISPLAY variable to its environment.
Nvidia driver installation issues¶
Error: “An NVIDIA kernel module ‘nvidia-drm’ appears to already be loaded in your kernel. ****”
Solution:
The error suggests that another version of the NVIDIA driver is loaded. You may refer to the links below to uninstall previous driver versions. The uninstallation method differs based on the installation method (runfile or debian).
Quick Links:
1. Handle Conflicting Installation Methods.
Nvidia TensorRT installation issues¶
Error: packages have unmet dependencies
Solution:
$ sudo apt install libnvinfer-plugin-dev=8.0.1-1+cuda11.4 libnvparsers-dev=8.0.1-1+cuda11.4 libnvonnxparsers-dev=8.0.1-1+cuda11.4 libnvinfer-dev=8.0.1-1+cuda11.4 \
    libnvinfer8=8.0.1-1+cuda11.4 libnvonnxparsers8=8.0.1-1+cuda11.4 libnvinfer-plugin8=8.0.1-1+cuda11.4 libnvparsers8=8.0.1-1+cuda11.4
Graph Composer Troubleshooting¶
My component is not visible in the composer even after registering the extension with the registry¶
Run “registry extn info -n <extn-name>” to check that the component is part of the extension.
If not, check that a call to GXF_EXT_FACTORY_ADD() corresponding to the component has been added within the GXF_EXT_FACTORY code block of the extension.
Run “registry comp info -t <comp-type>” to check whether the component is being detected as an abstract type.
If yes, the composer does not list abstract types since they cannot be instantiated. Check the solution to the next question.
My component is getting registered as an abstract type.¶
This usually happens because pure virtual methods inherited from the base class hierarchy have not been implemented. A quick way to find which pure virtual methods have not been implemented is to try to declare an object of the component class. Adding “MyComponentType comp;” anywhere after the class definition should throw compilation errors pointing to the missing implementations.
When executing a graph, the execution ends immediately with the warning “No system specified. Nothing to do”¶
Check that the NvDsScheduler component is part of the graph.