Server#

The server consists of a Triton Inference Server (https://developer.nvidia.com/nvidia-triton-inference-server), the Maxine SDK model repository, and the Maxine SDK backend library.

The Triton-enabled AR and VFX SDKs use Triton Inference Server features such as dynamic batching, sequence batching, and model ensembles to deliver higher throughput and multistream support.

The Maxine SDK backend library and the Maxine SDK model repository implement the Maxine features on the Triton server. The Maxine SDK model repository contains the models and their configuration files.

The server is supplied with default configuration files, which can be used as is. We recommend that users do not edit the tensor names, types, or shapes, or the sequence batching and ensemble architecture in the configuration files. However, some of the parameters in the configuration files can be modified to optimize performance and to enable or disable certain features, as discussed later. Refer to https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/user_guide/model_configuration.html for more details about the Triton model configuration files.

Configuring AR SDK Server#

Each feature is implemented as a folder in the model repository. The following table lists the location for the configuration file for each feature.

| Feature | Configuration File Location |
|---|---|
| Face Detection | FaceBox |
| Face Keypoints, 68 Keypoints, mode 0 | FaceKeypoints68Mode0 |
| Face Keypoints, 68 Keypoints, mode 1 | FaceKeypoints68Mode1 |
| Face Keypoints, 126 Keypoints, mode 0 | FaceKeypoints126Mode0 |
| Face Keypoints, 126 Keypoints, mode 1 | FaceKeypoints126Mode1 |
| Gaze Redirection, 68 Keypoints | GazeRedirectionKey68 |
| Gaze Redirection, 126 Keypoints | GazeRedirectionKey126 |
| VideoLivePortrait, Perf model, Mode1, without FrameSelection | VideoLPPerfMode1FS0 |
| VideoLivePortrait, Perf model, Mode1, with FrameSelection | VideoLPPerfMode1FS1 |
| VideoLivePortrait, Perf model, Mode2, without FrameSelection | VideoLPPerfMode2FS0 |
| VideoLivePortrait, Perf model, Mode2, with FrameSelection | VideoLPPerfMode2FS1 |
| VideoLivePortrait, Perf model, Mode3, without FrameSelection | VideoLPPerfMode3FS0 |
| VideoLivePortrait, Perf model, Mode3, with FrameSelection | VideoLPPerfMode3FS1 |
| VideoLivePortrait, Qual model, Mode1, without FrameSelection | VideoLPQualMode1FS0 |
| VideoLivePortrait, Qual model, Mode1, with FrameSelection | VideoLPQualMode1FS1 |
| VideoLivePortrait, Qual model, Mode2, without FrameSelection | VideoLPQualMode2FS0 |
| VideoLivePortrait, Qual model, Mode2, with FrameSelection | VideoLPQualMode2FS1 |
| VideoLivePortrait, Qual model, Mode3, without FrameSelection | VideoLPQualMode3FS0 |
| VideoLivePortrait, Qual model, Mode3, with FrameSelection | VideoLPQualMode3FS1 |
| LipSync | LipSync |
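Each of these folders is a Triton model (or ensemble) directory. While the exact model file names are those shipped with the SDK, the repository generally follows Triton's standard layout, sketched below for the Face Detection feature (the version directory `1` is illustrative):

```text
model_repository/
├── FaceBox/
│   ├── config.pbtxt        # model configuration; contains the editable parameters
│   └── 1/                  # numbered model version directory
│       └── ...             # model files shipped with the SDK
├── FaceKeypoints68Mode0/
│   ├── config.pbtxt
│   └── 1/
│       └── ...
└── ...
```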

The following parameters can be modified in the configuration file:

  • Maximum batch size

    The property max_batch_size in the configuration file sets the maximum size of the batch Triton uses with the dynamic batcher. We recommend that this parameter be set to a value equal to the expected number of active video streams.

  • Dynamic batching parameters

    Dynamic batching can be optimized by setting the following properties:

    • max_candidate_sequences: The maximum number of possible concurrent video streams.

    • max_queue_delay_microseconds: The amount of time, in microseconds, that the dynamic batcher will wait to complete the batch.

    • max_sequence_idle_microseconds: The amount of time, in microseconds, an idle input video stream is kept active.

  • Instance group

    The instance_group property can be used to create multiple instances of the feature on the Triton server, either on the same GPU or on multiple GPUs. Refer to triton-inference-server/server for more details. Note that the kind field should always be set to KIND_GPU.
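Putting these parameters together, a feature's config.pbtxt might contain a fragment like the following. The values shown are illustrative examples, not recommendations, and the fragment assumes the sequence batcher's Oldest scheduling strategy (where max_candidate_sequences lives in Triton's configuration schema); as noted above, the tensor names and batching architecture in the shipped files should be left unchanged.

```protobuf
# Illustrative fragment of a feature's config.pbtxt -- example values only
max_batch_size: 8                           # set to the expected number of active video streams

sequence_batching {
  max_sequence_idle_microseconds: 5000000   # keep an idle stream active for up to 5 s
  oldest {
    max_candidate_sequences: 8              # maximum number of concurrent video streams
    max_queue_delay_microseconds: 1000      # wait up to 1 ms to complete a batch
  }
}

instance_group [
  {
    count: 2          # two instances of the feature
    kind: KIND_GPU    # must always be KIND_GPU for Maxine features
    gpus: [ 0 ]       # place both instances on GPU 0
  }
]
```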

Configuring VFX SDK Server#

The AI Green Screen feature is implemented as a Triton ensemble (triton-inference-server/server). In the model repository, the folders AigsEnsembleMode0 and AigsEnsembleMode1 contain the ensembles for AI Green Screen mode 0 and mode 1, respectively. The corresponding models and configuration files for mode 0 and mode 1 are in AigsStatefulModelMode0 and AigsStatefulModelMode1.

The following parameters may be modified in the configuration files in the AigsStatefulModelMode0 and AigsStatefulModelMode1 folders:

  • Dynamic batching parameters

    Dynamic batching can be optimized by setting the following properties:

    • max_candidate_sequences: The maximum number of possible concurrent video streams.

    • max_queue_delay_microseconds: The amount of time, in microseconds, that the dynamic batcher will wait to complete the batch.

    • max_sequence_idle_microseconds: The amount of time, in microseconds, an idle input video stream is kept active.

  • Instance group

    The instance_group property can be used to create multiple instances of the feature on the Triton server, either on the same GPU or on multiple GPUs. Refer to triton-inference-server/server for more details. Note that the kind field should always be set to KIND_GPU.
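As with the AR features, these properties live in the stateful model's config.pbtxt (AigsStatefulModelMode0 or AigsStatefulModelMode1). A fragment with illustrative values, again assuming the Oldest sequence-scheduling strategy, and showing an instance group spread across two GPUs:

```protobuf
# Illustrative fragment of AigsStatefulModelMode0/config.pbtxt -- example values only
sequence_batching {
  max_sequence_idle_microseconds: 5000000   # keep an idle stream active for up to 5 s
  oldest {
    max_candidate_sequences: 4              # maximum number of concurrent video streams
    max_queue_delay_microseconds: 500       # wait up to 0.5 ms to complete a batch
  }
}

instance_group [
  { count: 1, kind: KIND_GPU, gpus: [ 0 ] },
  { count: 1, kind: KIND_GPU, gpus: [ 1 ] }  # second instance on a second GPU, if present
]
```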