Troubleshooting#

Lag and stutters in animation or audio can be caused by various factors. This section lists common issues and explains how to narrow down the root cause.

Audio2Face-3D Microservice Too Slow#

In some circumstances, the Audio2Face-3D microservice streams animation data slower than real time (e.g. slower than 30.0 frames per second). In that case, the Animation Graph microservice cannot play back all the animation data and simply drops the frames that arrive too late. When this happens, the following message is printed in the Animation Graph microservice logs:

Warning: buffer underrun. Discarding current data

When this happens a lot, and the root cause is that the Audio2Face-3D microservice provides data too slowly or very irregularly, this can be mitigated somewhat by increasing the value of the animationSource.bufferSize UCS microservice parameter of the Animation Graph microservice. The higher this value, the more animation data is buffered before playback. This makes the connection more robust to jitter and delayed animation data, but at the cost of higher latency. We found a value of 0.1s to be acceptable. Note that this value depends on your system’s configuration and performance characteristics.
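As an illustration only, here is what this could look like in a Helm-style values override, assuming the dotted UCS parameter maps to nested YAML keys (the exact structure depends on your deployment):

animationSource:
  bufferSize: 0.1  # seconds of animation data buffered before playback; higher values add robustness but also latency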

Audio2Face-3D Microservice Does Not Send Animation Data#

In rare scenarios, we observed that the Audio2Face-3D microservice may send neither animation data chunks nor the animation data stream header. When this happens, no animation data is played back, and the following message is printed in the logs:

Received SUCCESS status before the header in the animation data stream!

We have only experienced this issue in experimental scenarios. If it occurs, investigate whether the Audio2Face-3D microservice is running correctly, and consider upgrading it if possible.

Animation Graph Microservice Too Slow#

Sometimes, the animation data from the Animation Graph microservice arrives at the Omniverse Renderer microservice slower than real time. When this happens, the following message is printed in the Omniverse Renderer microservice logs:

Display time is in the past! Skipping frame!

This usually means that the Animation Graph microservice was slowed down for a short period, e.g. when the CPU was busy handling another process. Here again, if the issue occurs frequently, you can mitigate it by increasing the value of the animationSource.bufferSize UCS parameter of the Omniverse Renderer microservice. The higher this value, the bigger the animation data input buffer. This reduces the risk of animation data stream jitter, but comes at the cost of higher latency. We found a value of 0.1s to be acceptable. Note that this value depends on your system’s configuration and performance characteristics.
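Similarly, a hypothetical values override for the Omniverse Renderer microservice, under the same assumption about the YAML structure as above:

animationSource:
  bufferSize: 0.1  # seconds of animation data buffered at the renderer input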

Speech Audio and Lip Animation Synchronization#

It may happen that there are no warnings in the animation data streams and playback is smooth, yet the speech audio and the lip animation are out of sync.

The root cause is that the Livestream extension sends audio and video through two separate RTP streams. These streams don’t support synchronization protocols, which may lead to a perceptible temporal offset between the animation and the audio.

To solve this, the Omniverse Renderer microservice has a livestream.audioDelay UCS parameter that delays the audio by the specified number of seconds. We found that a value of about 0.1s resolved the issue on the systems we tested.
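As a sketch, under the same assumption about the values-file structure as in the examples above:

livestream:
  audioDelay: 0.1  # delay audio playback by 0.1 s to compensate for the offset between the RTP streams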

Performance Monitoring#

If you observe lags or interruptions that are not caused by any issue documented above, the next step is to inspect the frame rate of the Animation Graph and the Omniverse Renderer microservices.

The frame rate for the Animation Graph microservice is printed in the logs:

Output animation data | Stream ID: <stream_id> | Mean frame time: 0.0334

And for the Omniverse Renderer microservice:

Rendering animation data | Time since start [s]: 1314.159 | Port: 8000 | Mean frame time: 0.0334

These values are usually very similar.

If the Mean frame time of the Animation Graph and/or the Omniverse Renderer microservice is frequently above ~0.034 seconds (i.e., below the ~30 FPS target), this usually indicates a general performance issue. It is often caused by the GPU reaching 100% usage and slowing down the whole system.
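To confirm whether the GPU is the bottleneck, watch its utilization while the pipeline is running, for example:

watch -n 1 nvidia-smi

If GPU utilization is pinned at 100% while the mean frame times are high, consider reducing the rendering load (e.g. a lower resolution) or distributing the microservices across multiple GPUs.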

Avatar Gesture Is Not Triggered#

If you trigger an invalid animation gesture or posture state, the HTTP call succeeds without warning or error. However, the avatar goes into a neutral stance and extends one or more fingers of the left hand: the so-called test pose. This is an indicator that the Animation Graph received invalid input or no input at all, e.g. because the gesture string was misspelled or the character was not found by the microservice. You can recover from this state by triggering a new gesture and a new posture with valid values.
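For reference, a recovery call could look like the sketch below. The port, path, and variable name here are assumptions, so check them against the Animation Graph microservice API reference:

curl -X PUT "http://<animation-graph-host>:8020/streams/<stream_id>/animation_graphs/avatar/variables/gesture_state/<valid_gesture_name>"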

This image shows the test pose the avatar will go into when an invalid gesture or posture is triggered.

Glfw Warnings#

When running the Animation Pipeline with Docker, you may encounter glfw warnings like the ones below in the Animation Graph microservice’s logs. When they occur, you are unable to see the scene. If you use the Unreal Renderer, the UI is stuck with the message “WebRTC Connection Negotiated”.

2024-05-24 16:41:09 [1,094ms] [Warning] [carb.windowing-glfw.plugin] GLFW initialization failed.
2024-05-24 16:41:09 [1,094ms] [Warning] [carb] Failed to startup plugin carb.windowing-glfw.plugin (interfaces: [carb::windowing::IGLContext v1.0],[carb::windowing::IWindowing v1.4]) (impl: carb.windowing-glfw.plugin)

To address this issue, run the following:

sudo xhost +

Then re-run the Animation Graph container with the following extra parameters placed before the name of the image: -v /tmp/.X11-unix:/tmp/.X11-unix -e DISPLAY
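For example, a full invocation could look like this hypothetical command (substitute your actual image name and any other flags you normally pass):

docker run --gpus all -v /tmp/.X11-unix:/tmp/.X11-unix -e DISPLAY <animation-graph-image>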

Not Enough Memory#

If Docker containers or Kubernetes deployments fail to start, you may be running out of memory. In Kubernetes, this manifests as a pod stuck in a CrashLoopBackOff state. In the container’s logs, you see lines resembling the following:

[A2F SDK] [ERROR] [TensorRT] 1: [defaultAllocator.cpp::allocate::20] Error Code 1: Cuda Runtime (out of memory)
[A2F SDK] [ERROR] [TensorRT] 2: [executionContext.cpp::ExecutionContext::410] Error Code 2: OutOfMemory (no further information)
[A2F SDK] [ERROR] Unable to create TensorRT Execution Context
[A2F SDK] [ERROR] Unable to initialize inference engine
[A2F SDK] [ERROR] SetNetwork Processor failed
[A2F SDK] [ERROR] Cannot Initialize from Json file: /opt/nvidia/a2f_pipeline/a2f_data/data/networks/claire_v1.3/a2f_ms_config.json
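Because these errors come from the CUDA runtime, the exhausted resource is typically GPU memory rather than system RAM. You can check how much GPU memory is available before starting the container:

nvidia-smi --query-gpu=memory.used,memory.total --format=csv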

Permission Denied For EpicGames Docker Images#

You may encounter a Permission denied message when trying to run the Pixel Streaming Signalling Server. If this happens, ensure that:

  1. Your GitHub Personal Access Token has permissions for read:project and read:packages

  2. You are a member of the EpicGames GitHub organization. You may have to manually accept the invitation to join the organization
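Once both conditions are met, authenticate Docker against GitHub’s container registry before pulling the image. A minimal sketch, assuming the image is hosted on ghcr.io:

docker login ghcr.io -u <github-username>

Paste your Personal Access Token when prompted for a password.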

Unreal Renderer Microservice Video Jitters#

On some machines and with certain quality settings (often lower quality settings), we occasionally noticed jitter in the streamed video. The underlying issue has not been root-caused yet. As a workaround, we recommend slightly increasing the GPU load by changing quality-related settings in the project. For example, increasing the resolution, increasing the Scalability quality setting, increasing the Fixed Frame Rate, or changing the Pixel Streaming video codec from H264 to VP9 helps mitigate the jitters.
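As an illustration, the codec can typically be selected with a Pixel Streaming launch argument appended to the application’s start command. The flag below is an assumption based on common Pixel Streaming builds, so verify it against your Unreal Engine version:

<unreal-app> -PixelStreamingEncoderCodec=VP9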