Properties for the AR SDK Features
This section provides the properties and their values for the features in the AR SDK.
Face Tracking Property Values
The following tables list the values for the configuration, input, and output properties for face tracking.
Table 3‑2: Configuration Properties for Face Tracking
Property Name |
Value |
|---|---|
FeatureDescription |
Free-form string that describes the feature. The string is set by the SDK and cannot be modified by the user. |
CUDAStream |
The CUDA stream, which is set by the user. |
ModelDir |
String that contains the path to the folder that contains the TensorRT package files. Set by the user. |
Temporal |
Unsigned integer to enable (1) or disable (0) the temporal optimization of face detection. If enabled, only one face is returned. For more information, refer to Face Detection and Tracking. Set by the user. |
Table 3‑3: Input Properties for Face Tracking
Property Name |
Value |
|---|---|
Image |
Interleaved (or chunky) 8-bit BGR input image in a CUDA buffer of type
To be allocated and set by the user. |
Table 3‑4: Output Properties for Face Tracking
Property Name |
Value |
|---|---|
BoundingBoxes |
To be allocated by the user. |
BoundingBoxesConfidence |
Optional: An array of single-precision (32-bit) floating-point numbers that contain the confidence values for each detected face box. To be allocated by the user. |
Landmark Tracking Property Values
The following tables list the values for the configuration, input, and output properties for landmark tracking.
Table 3‑5: Configuration Properties for Landmark Tracking
Property Name |
Value |
|---|---|
FeatureDescription |
String that describes the feature. |
CUDAStream |
The CUDA stream. Set by the user. |
ModelDir |
String that contains the path to the folder that contains the TensorRT package files. Set by the user. |
BatchSize |
The number of inferences to be run at one time on the GPU. The maximum value is 8. Temporal optimization of landmark detection is supported only for |
Landmarks_Size |
Unsigned integer, 68 or 126. Specifies the number of landmark points (x and y values) to be returned. Set by the user. |
LandmarksConfidence_Size |
Unsigned integer, 68 or 126. Specifies the number of landmark confidence values for the detected keypoints to be returned. Set by the user. |
Temporal |
Unsigned integer to enable (1) or disable (0) the temporal optimization of landmark detection. If enabled, only one input bounding box is supported as the input. For more information, refer to Face Detection and Tracking. Set by the user. |
Mode |
Optional: Unsigned integer. Set 0 to enable Performance mode (default) or 1 to enable Quality mode for landmark detection. Set by the user. |
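The constraints in Table 3‑5 can be checked before any values are handed to the SDK. The following Python sketch is illustrative only: `LandmarkConfig` and `validate_landmark_config` are hypothetical names that are not part of the AR SDK, the lower bound of 1 for `BatchSize` is an assumption, and the field defaults other than `Mode` are placeholders rather than documented defaults.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LandmarkConfig:
    """Hypothetical container that mirrors the property names in Table 3-5."""
    batch_size: int = 1                   # placeholder default (assumption)
    landmarks_size: int = 68              # placeholder default (assumption)
    landmarks_confidence_size: int = 68   # placeholder default (assumption)
    temporal: int = 0                     # placeholder default (assumption)
    mode: int = 0                         # 0 = Performance (documented default)


def validate_landmark_config(cfg: LandmarkConfig) -> List[str]:
    """Return a list of problems; an empty list means the values fit the table."""
    problems = []
    # The table states a maximum of 8; the minimum of 1 is an assumption.
    if not 1 <= cfg.batch_size <= 8:
        problems.append("BatchSize must be between 1 and 8")
    if cfg.landmarks_size not in (68, 126):
        problems.append("Landmarks_Size must be 68 or 126")
    if cfg.landmarks_confidence_size not in (68, 126):
        problems.append("LandmarksConfidence_Size must be 68 or 126")
    if cfg.temporal not in (0, 1):
        problems.append("Temporal must be 0 or 1")
    if cfg.mode not in (0, 1):
        problems.append("Mode must be 0 (Performance) or 1 (Quality)")
    return problems
```

A caller could run `validate_landmark_config(LandmarkConfig(batch_size=9))` and report the returned messages before attempting to load the feature.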
Table 3‑6: Input Properties for Landmark Tracking
Property Name |
Value |
|---|---|
Image |
Interleaved (or chunky) 8-bit BGR input image in a CUDA buffer of type
To be allocated and set by the user. |
BoundingBoxes |
Optional: If not specified as an input property, face detection is automatically run on the input image. For more information, refer to Face Detection and Tracking. To be allocated by the user. |
Table 3‑7: Output Properties for Landmark Tracking
Property Name |
Value |
|---|---|
Landmarks |
To be allocated by the user. |
Pose |
Optional: The OpenGL standards coordinate convention is used: When you look up from a camera, the coordinates are x (camera right), y (camera up), and z (toward camera). To be allocated by the user. |
LandmarksConfidence |
Optional: An array of single-precision (32-bit) floating-point numbers, which must be large enough to hold the number of confidence values given by the product of the following:
To be allocated by the user. |
BoundingBoxes |
Optional: To be allocated by the user. |
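Because the output buffers are allocated by the user, it can help to compute their element counts up front. The sketch below is a hedged illustration, not SDK code: the factors of the product for `LandmarksConfidence` are not listed in this section, so the helper assumes the product is `BatchSize` × `Landmarks_Size`, and it assumes each landmark point stores two floats (x and y). The function name `landmark_buffer_sizes` is hypothetical.

```python
def landmark_buffer_sizes(batch_size: int, landmarks_size: int) -> dict:
    """Element counts for user-allocated landmark outputs (assumed layout)."""
    if not 1 <= batch_size <= 8:
        raise ValueError("BatchSize must be between 1 and 8")
    if landmarks_size not in (68, 126):
        raise ValueError("Landmarks_Size must be 68 or 126")
    # Assumption: one set of landmarks per image in the batch.
    points = batch_size * landmarks_size
    return {
        "landmark_points": points,
        "landmark_floats": points * 2,   # x and y per point (assumption)
        "confidence_floats": points,     # one confidence per point (assumption)
    }
```

For example, a batch of 8 images with 126 landmarks each would need room for 1008 points under these assumptions.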
Face 3D Mesh Tracking Property Values
The following tables list the values for the configuration, input, and output properties for Face 3D Mesh tracking.
Table 3‑8: Configuration Properties for Face 3D Mesh Tracking
Property Name |
Value |
|---|---|
FeatureDescription |
String that describes the feature. This property is read-only. |
ModelDir |
String that contains the path to the face model and the TensorRT package files. For more information, refer to Alternative Usage of the Face 3D Mesh Feature. Set by the user. |
CUDAStream |
Optional: The CUDA stream. For more information, refer to Alternative Usage of the Face 3D Mesh Feature. Set by the user. |
Temporal |
Optional: Unsigned integer to enable (1) or disable (0) the temporal optimization of face and landmark detection. For more information, refer to Alternative Usage of the Face 3D Mesh Feature. Set by the user. |
Mode |
Optional: Unsigned integer. Set 0 to enable Performance mode (default) or 1 to enable Quality mode for landmark detection. Set by the user. |
Landmarks_Size |
Unsigned integer, 68 or 126. If landmark detection is run internally, the confidence values for the detected key points are returned. For more information, refer to Alternative Usage of the Face 3D Mesh Feature. |
ShapeEigenValueCount |
The number of eigenvalues that describe the identity shape. Query this to determine how big the eigenvalue array should be, if that is a desired output. This property is read-only. |
ExpressionCount |
The number of expressions available in the chosen model. Query this to determine how big the expression coefficient array should be, if that is the desired output. This property is read-only. |
VertexCount |
The number of vertices in the chosen model. Query this property to determine how big the vertex array should be, where
This property is read-only. |
TriangleCount |
The number of triangles in the chosen model. Query this property to determine how big the triangle array should be, where
This property is read-only. |
GazeMode |
Flag to toggle gaze mode. The default value is 0. If the value is 1, gaze estimation is explicit. |
Table 3‑9: Input Properties for Face 3D Mesh Tracking
Property Name |
Value |
|---|---|
Width |
The width of the input image buffer that contains the face to which the face model will be fitted. Set by the user. |
Height |
The height of the input image buffer that contains the face to which the face model will be fitted. Set by the user. |
Landmarks |
Optional: An If landmarks are not provided to this feature, an input image must be provided. For more information, refer to Alternative Usage of the Face 3D Mesh Feature. To be allocated by the user. |
Image |
Optional: An interleaved (or chunky) 8-bit BGR input image in a CUDA buffer of type
If an input image is not provided, the landmark points must be provided to this feature as input. For more information, refer to Alternative Usage of the Face 3D Mesh Feature. To be allocated by the user. |
Table 3‑10: Output Properties for Face 3D Mesh Tracking
Property Name |
Value |
|---|---|
FaceMesh |
To be allocated by the user. Query |
RenderingParams |
To be allocated by the user. |
Landmarks |
Optional: An For more information, refer to Alternative Usage of the Face 3D Mesh Feature. To be allocated by the user. |
Pose |
Optional: The OpenGL standards coordinate convention is used: When you look up from a camera, the coordinates are x (camera right), y (camera up), and z (toward camera). To be allocated by the user. |
LandmarksConfidence |
Optional: An array of single-precision (32-bit) floating-point numbers, which must be large enough to hold the number of confidence values of size
For more information, refer to Alternative Usage of the Face 3D Mesh Feature. To be allocated by the user. |
BoundingBoxes |
Optional: To be allocated by the user. |
BoundingBoxesConfidence |
Optional: An array of single-precision (32-bit) floating-point numbers that contain the confidence values for each detected face box. For more information, refer to Alternative Usage of the Face 3D Mesh Feature. To be allocated by the user. |
ShapeEigenValues |
Optional: The array into which the shape eigenvalues will be placed, if desired. Query
To be allocated by the user. |
ExpressionCoefficients |
Optional: The array into which the expression coefficients will be placed, if desired. Query
To be allocated by the user. The corresponding expression shapes for
|
Eye Contact Property Values
The following tables list the values for the configuration, input, and output properties for gaze redirection.
Table 3‑11: Configuration Properties for Eye Contact
Property Name |
Value |
|---|---|
FeatureDescription |
String that describes the feature. |
ModelDir |
String that contains the path to the folder that contains the TensorRT package files. Set by the user. |
BatchSize |
The number of inferences to be run at one time on the GPU. The maximum value is 1. |
Landmarks_Size |
Unsigned integer, either 68 or 126. Specifies the number of landmark points (x and y values) to be returned. Set by the user. |
LandmarksConfidence_Size |
Unsigned integer, either 68 or 126. Specifies the number of landmark confidence values for the detected keypoints to be returned. Set by the user. |
GazeRedirect |
Flag to enable or disable gaze redirection. When enabled, the gaze is estimated, and the redirected image is set as the output. When disabled, the gaze is estimated but redirection does not occur. |
Temporal |
Unsigned integer to enable (1) or disable (0) the temporal optimization of landmark detection. Set by the user. |
DetectClosure |
Flag to toggle the detection of eye closure and occlusion. The default value is On. |
EyeSizeSensitivity |
An unsigned integer in the range 2–5, inclusive, that is used to increase the sensitivity of the algorithm to the redirected eye size. A value of 2 uses a smaller eye region, and a value of 5 uses a larger eye size. |
UseCudaGraph |
Bool. Default is false. Flag to use CUDA Graphs for optimization. Set by the user. |
EnableLookAway |
Bool. Default is false. Flag that, when set to true, redirects the eyes to look away at a random time for a random period. The eyes follow the relative changes in estimated gaze during the lookaway period. Set by the user. |
LookAwayOffsetMax |
Unsigned int value in the range 0–10. Default is 5. If the value is set to x degrees, a randomly chosen offset angle in the range −x to x degrees is added to the lookaway angle during the random lookaway period. The lookaway angle is based on the relative motion of the eyes in the input image during the lookaway period. The offset is not used outside the lookaway period. Set by the user. |
LookAwayIntervalMin |
Unsigned int value in the range 1–600. Default is 100. Minimum limit for the number of frames at which random lookaway occurs. This value is applicable only when
The value can be optionally set by the user. |
LookAwayIntervalRange |
Unsigned int value in the range 1–600. Default is 250. Interval range for picking the number of frames at which random lookaway occurs. Adding this range to
The value can be optionally set by the user. |
GazePitchThresholdLow |
Float value in the range of 10.0–35.0 (degrees). Default is 20.0. This is a range control parameter. It defines the threshold for estimated gaze angle in the pitch direction within which gaze is always redirected towards the camera. Beyond this angle, the redirected gaze transitions away from the camera and towards the estimated gaze angle. This value is optionally set by the user. |
GazeYawThresholdLow |
Float value in the range of 10.0–35.0 (degrees). Default is 20.0. This is a range control parameter. It defines the threshold for estimated gaze angle in the yaw direction within which gaze is always redirected towards the camera. Beyond this angle, the redirected gaze transitions away from the camera and towards the estimated gaze angle. This value is optionally set by the user. |
HeadPitchThresholdLow |
Float value in the range of 10.0–35.0 (degrees). Default is 15.0. This is a range control parameter. It defines the threshold for estimated head pose angle in the pitch direction within which gaze is always redirected towards the camera. Beyond this angle, the redirected gaze transitions away from the camera and towards the estimated gaze angle. This value is optionally set by the user. |
HeadYawThresholdLow |
Float value in the range of 10.0–35.0 (degrees). Default is 25.0. This is a range control parameter. It defines the threshold for estimated head pose angle in the yaw direction within which gaze is always redirected towards the camera. Beyond this angle, the redirected gaze transitions away from the camera and towards the estimated gaze angle. This value is optionally set by the user. |
GazePitchThresholdHigh |
Float value in the range of 10.0–35.0 (degrees). Default is 30.0. This is a range control parameter. It defines the threshold for estimated gaze angle in the pitch direction beyond which no redirection occurs and the angle of redirected gaze is equal to the estimated gaze. The redirected gaze in the pitch direction increasingly moves away from the camera and towards the estimated gaze beyond
This value is optionally set by the user. |
GazeYawThresholdHigh |
Float value in the range of 10.0–35.0 (degrees). Default is 30.0. This is a range control parameter. It defines the threshold for estimated gaze angle in the yaw direction beyond which no redirection occurs and the angle of redirected gaze is equal to the estimated gaze. The redirected gaze in the yaw direction increasingly moves away from the camera and towards the estimated gaze beyond
This value is optionally set by the user. |
HeadPitchThresholdHigh |
Float value in the range of 10.0–35.0 (degrees). Default is 25.0. This is a range control parameter. It defines the threshold for estimated head pose angle in the pitch direction beyond which no redirection occurs and the angle of redirected gaze is equal to the estimated gaze. The redirected gaze in the pitch direction increasingly moves away from the camera and towards the estimated gaze beyond
This value is optionally set by the user. |
HeadYawThresholdHigh |
Float value in the range of 10.0–35.0 (degrees). Default is 30.0. This is a range control parameter. It defines the threshold for estimated head pose angle in the yaw direction beyond which no redirection occurs and the angle of redirected gaze is equal to the estimated gaze. The redirected gaze in the yaw direction increasingly moves away from the camera and towards the estimated gaze beyond
This value is optionally set by the user. |
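The low and high thresholds in Table 3‑11 describe three regions: full redirection towards the camera below the low threshold, no redirection above the high threshold, and a transition in between. The sketch below illustrates that behavior for a single angle. It is not the SDK's algorithm: the shape of the transition is not documented here, so a linear ramp is assumed, the camera direction is taken to be 0 degrees, the thresholds are applied to the magnitude of the angle, and the function names are hypothetical. The defaults used are the documented `GazePitchThresholdLow` (20.0) and `GazePitchThresholdHigh` (30.0).

```python
def redirect_weight(angle_deg: float, low_deg: float, high_deg: float) -> float:
    """Return 1.0 for full redirection to the camera and 0.0 for none."""
    if low_deg >= high_deg:
        raise ValueError("low threshold must be below high threshold")
    magnitude = abs(angle_deg)  # assumption: thresholds are symmetric about 0
    if magnitude <= low_deg:
        return 1.0
    if magnitude >= high_deg:
        return 0.0
    # Assumed linear transition between the two thresholds.
    return (high_deg - magnitude) / (high_deg - low_deg)


def redirected_angle(estimated_deg: float,
                     low_deg: float = 20.0,
                     high_deg: float = 30.0) -> float:
    """Blend between the camera direction (0 degrees) and the estimated gaze."""
    weight = redirect_weight(estimated_deg, low_deg, high_deg)
    return (1.0 - weight) * estimated_deg
```

With these assumptions, an estimated pitch of 25 degrees falls halfway through the transition and yields a redirected pitch of 12.5 degrees, while 40 degrees is left unchanged.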
Table 3‑12: Input Properties for Eye Contact
Property Name |
Value |
|---|---|
Image |
Interleaved (or chunky) 8-bit BGR input image in a CUDA buffer of type
To be allocated and set by the user. |
Width |
The width of the input image buffer that contains the face to which the face model will be fitted. Set by the user. |
Height |
The height of the input image buffer that contains the face to which the face model will be fitted. Set by the user. |
Landmarks |
Optional: An If landmarks are not provided to this feature, an input image must be provided. For more information, refer to Alternative Usage of the Face 3D Mesh Feature. To be allocated by the user. |
Table 3‑13: Output Properties for Eye Contact
Property Name |
Value |
|---|---|
Landmarks |
To be allocated by the user. |
HeadPose |
Optional: The OpenGL standards coordinate convention is used: When you look up from a camera, the coordinates are x (camera right), y (camera up), and z (toward the camera). To be allocated by the user. |
LandmarksConfidence |
Optional: An array of single-precision (32-bit) floating-point numbers, which must be large enough to hold the number of confidence values given by the product of the following:
To be allocated by the user. |
BoundingBoxes |
Optional: To be allocated by the user. |
OutputGazeVector |
Float array, which must be large enough to hold the two values (pitch and yaw) for the gaze angle in radians per image. For batch sizes larger than 1, it should hold
To be allocated by the user. |
OutputHeadTranslation |
Optional: Float array, which must be large enough to hold the head translations (x, y, z) per image. For batch sizes larger than 1, it should hold
To be allocated by the user. |
GazeDirection |
Optional: Each element contains two To be allocated by the user. |
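`OutputGazeVector` reports pitch and yaw in radians, while the threshold properties in Table 3‑11 are given in degrees, so a conversion step is often convenient. The following sketch is an illustration under stated assumptions: the buffer is assumed to be a flat sequence with pitch followed by yaw for each image, and `unpack_gaze_vector` is a hypothetical helper, not an SDK function.

```python
import math
from typing import List, Sequence, Tuple


def unpack_gaze_vector(buffer: Sequence[float],
                       batch_size: int = 1) -> List[Tuple[float, float]]:
    """Split a flat radian buffer into per-image (pitch, yaw) pairs in degrees."""
    if len(buffer) < 2 * batch_size:
        raise ValueError("buffer must hold two values per image")
    pairs = []
    for i in range(batch_size):
        pitch_rad = buffer[2 * i]       # assumed order: pitch first
        yaw_rad = buffer[2 * i + 1]     # then yaw
        pairs.append((math.degrees(pitch_rad), math.degrees(yaw_rad)))
    return pairs
```

The resulting degree values can then be compared directly with the gaze threshold settings.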
Body Detection Property Values
The following tables list the values for the configuration, input, and output properties for Body Detection.
Table 3‑14: Configuration Properties for Body Detection
Property Name |
Value |
|---|---|
FeatureDescription |
Free-form string that describes the feature. The string is set by the SDK and cannot be modified by the user. |
CUDAStream |
The CUDA stream, which is set by the user. |
ModelDir |
String that contains the path to the folder that contains the TensorRT package files. Set by the user. |
Temporal |
Unsigned integer to enable (1) or disable (0) the temporal optimization of body detection. Set by the user. |
FullBodyOnly |
Unsigned integer to select the estimation mode:
Set by the user. |
Table 3‑15: Input Properties for Body Detection
Property Name |
Value |
|---|---|
Image |
Interleaved (or chunky) 8-bit BGR input image in a CUDA buffer of type
To be allocated and set by the user. |
Table 3‑16: Output Properties for Body Detection
Property Name |
Value |
|---|---|
BoundingBoxes |
To be allocated by the user. |
BoundingBoxesConfidence |
Optional: An array of single-precision (32-bit) floating-point numbers that contain the confidence values for each detected body box. To be allocated by the user. |
3D Body Pose Keypoint Tracking Property Values
The following tables list the values for the configuration, input, and output properties for 3D Body Pose Keypoint Tracking.
Table 3‑17: Configuration Properties for 3D Body Pose Keypoint Tracking
Property Name |
Value |
|---|---|
FeatureDescription |
String that describes the feature. |
CUDAStream |
The CUDA stream. Set by the user. |
ModelDir |
String that contains the path to the folder that contains the TensorRT package files. Set by the user. |
BatchSize |
The number of inferences to be run at one time on the GPU. The maximum value is 1. |
Mode |
Unsigned integer that specifies the mode: High Performance (1) or High Quality (0). Default is 1. Set by the user. |
UseCudaGraph |
Boolean to enable (true) or disable (false) the use of CUDA Graphs for optimization. Set by the user. |
Temporal |
Unsigned integer to enable (1) or disable (0) the temporal optimization of Body Pose tracking. Set by the user. |
NumKeyPoints |
Unsigned integer that specifies the number of keypoints available, which is currently 34. |
ReferencePose |
Set by the user. |
FullBodyOnly |
Unsigned integer to select the pose estimation mode:
The default is 1. Set by the user. |
PostprocessJointAngle |
Boolean to enable (true) or disable (false) the postprocessing steps for joint angles corresponding to the joints predicted with low confidence. To be used only when
We recommend that you set this to true when the input is an upper-body image or video. The default is true. Set by the user. |
TargetSeatedPoseForInterpolation |
For the joints that are predicted with low confidence, the output pose will be interpolated to the corresponding pose specified in this target pose. This array is used when the SDK detects that the person in the input frame is in a seated pose. Used only when |
TargetStandPoseForInterpolation |
For the joints that are predicted with low confidence, the output pose will be interpolated to the corresponding pose specified in this target pose. This array is used when the SDK detects that the person in the input frame is in a standing pose. Used only when |
TrackPeople |
Unsigned integer to enable (1) or disable (0) multi-person tracking in Body Pose. Set by the user. |
ShadowTrackingAge |
Unsigned integer that specifies the age (in number of frames) after which the multi-person tracker stops tracking the object in shadow mode. The default is 90. Set by the user. |
ProbationAge |
Unsigned integer that specifies the age (in number of frames) after which the multi-person tracker marks the object valid and assigns an ID for tracking. The default is 10. Set by the user. |
MaxTargetsTracked |
Unsigned integer that specifies the maximum number of targets to be tracked by the multi-person tracker. After the tracking is complete, the new targets are discarded. The default is 30. Set by the user. |
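`ProbationAge` and `ShadowTrackingAge` together describe the lifetime of a tracked person: a new target receives an ID only after it has existed for the probation period, and a target that is no longer detected is dropped once it has spent too long in shadow mode. The sketch below models that lifecycle for a single target. It is a simplified illustration, not the SDK's tracker: the `Track` class, the `update_track` function, and the exact comparison rules are assumptions. The defaults (10 and 90 frames) come from Table 3‑17.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Track:
    """Hypothetical per-target state."""
    age: int = 0             # frames since the target was first seen
    frames_unseen: int = 0   # consecutive frames without a detection
    track_id: int = -1       # -1 until the probation period ends


def update_track(track: Track, detected: bool, next_id: int,
                 probation_age: int = 10,
                 shadow_tracking_age: int = 90) -> Tuple[bool, int]:
    """Advance one frame and return (keep_track, next_id)."""
    track.age += 1
    track.frames_unseen = 0 if detected else track.frames_unseen + 1
    # Assumption: the target is dropped once it exceeds the shadow age.
    if track.frames_unseen > shadow_tracking_age:
        return False, next_id
    # Assumption: an ID is assigned on the first detected frame at or
    # after the probation age.
    if detected and track.track_id < 0 and track.age >= probation_age:
        track.track_id = next_id
        next_id += 1
    return True, next_id
```

A caller would keep at most `MaxTargetsTracked` such records and call `update_track` once per frame for each of them.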
Table 3‑18: Input Properties for 3D Body Pose Keypoint Tracking
Property Name |
Value |
|---|---|
Image |
Interleaved (or chunky) 8-bit BGR input image in a CUDA buffer of type
To be allocated and set by the user. |
FocalLength |
Float value that specifies the focal length of the camera to be used for 3D Body Pose. The default value is 800.79041. To be allocated and set by the user. |
BoundingBoxes |
Optional: If not specified as an input property, body detection is automatically run on the input image. To be allocated by the user. |
Table 3‑19: Output Properties for 3D Body Pose Keypoint Tracking
Property Name |
Value |
|---|---|
Keypoints |
To be allocated by the user. |
Keypoints3D |
To be allocated by the user. |
JointAngles |
The local rotation (as a quaternion) of each joint with reference to the ReferencePose. To be allocated by the user. |
KeyPointsConfidence |
An array of single-precision (32-bit) floating-point numbers, which must be large enough to hold the number of confidence values given by the product of
To be allocated by the user. |
BoundingBoxes |
To be allocated by the user. |
TrackingBoundingBoxes |
To be allocated by the user. |
Facial Expression Estimation Property Values
The following tables list the values for the configuration, input, and output properties for Facial Expression Estimation.
Table 3‑20: Configuration Properties for Facial Expression Estimation
Property Name |
Value |
|---|---|
FeatureDescription |
String that describes the feature. This property is read-only. |
ModelDir |
String that contains the path to the face model and the TensorRT package files. Set by the user. |
CUDAStream |
Optional: The CUDA stream. Set by the user. |
Temporal |
Optional: Bitfield to control temporal filtering.
Default is 0x037 (all on except 0x100). Set by the user. |
Landmarks_Size |
Unsigned integer, 68 or 126. Required array size of detected facial landmark points. Length of array must be 126, to accommodate {x,y} location of each of the detected points. |
ExpressionCount |
Unsigned integer. The number of expressions in the face model. |
PoseMode |
Specifies how to compute pose. 0 = 3DOF (default), 1 = 6DOF explicit. 6DOF is required for 3D translation output. |
Mode |
Flag to toggle landmark mode. Set 0 to enable Performance model for landmark detection. Set 1 to enable Quality model for landmark detection for higher accuracy. Default is 1. |
EnableCheekPuff |
(Experimental) Enables cheek puff blendshapes. |
Table 3‑21: Input Properties for Facial Expression Estimation
Property Name |
Value |
|---|---|
Landmarks |
Optional: An If landmarks are not provided to this feature, an input image must be provided. To be allocated by the user. |
Image |
Optional: An interleaved (or chunky) 8-bit BGR input image in a CUDA buffer of type
If an input image is not provided, the landmark points must be provided to this feature as input. To be allocated by the user. |
CameraIntrinsicParams |
Optional: Camera intrinsic parameters. A three-element float array with elements corresponding to focal length, cx, and cy, respectively, of an ideal perspective camera. Any barrel or fisheye distortion should be removed or considered negligible. Used only if |
Table 3‑22: Output Properties for Facial Expression Estimation
Property Name |
Value |
|---|---|
Landmarks |
Optional: An |
Pose |
Optional: To be allocated by the user. |
PoseTranslation |
Optional: To be allocated by the user. |
LandmarksConfidence |
Optional: An array of single-precision (32-bit) floating-point numbers, which must be large enough to hold the number of confidence values of size
To be allocated by the user. |
BoundingBoxes |
Optional: To be allocated by the user. |
BoundingBoxesConfidence |
Optional: An array of single-precision (32-bit) floating-point numbers, which must be large enough to hold the number of confidence values of size
To be allocated by the user. |
ExpressionCoefficients |
The array into which the expression coefficients will be placed, if desired. Query To be allocated by the user. The corresponding expression shapes are in the following order:
|
Video Live Portrait Property Values
The following tables list the values for the configuration, input, and output properties for Live Portrait.
Table 3‑23: Configuration Properties for Video Live Portrait
Property Name |
Value |
|---|---|
FeatureDescription |
String that describes the feature. This property is read-only. |
ModelDir |
String that contains the path to the face model and the TensorRT package files. Set by the user. |
CUDAStream |
Optional: The CUDA stream. Set by the user. |
ModelSel |
Model optimized for performance or for quality.
Set by the user. |
Mode |
Video Live Portrait mode.
Set by the user. |
CheckFaceBox |
Flag for checking face bounding box status.
Set by the user. |
NetworkOutputImgWidth |
Width of the output image generated from the network (512 or 1024). |
NetworkOutputImgHeight |
Height of the output image generated from the network (512 or 1024). |
Table 3‑24: Input Properties for Video Live Portrait
Property Name |
Value |
|---|---|
SourceImage |
Chunky/packed 8-bit BGR or BGRA CUDA buffer. Requirements:
|
DriveImage |
Chunky/packed 8-bit BGR CUDA buffer. |
NeutralDriveImage |
Chunky/packed 8-bit BGR CUDA buffer. |
Table 3‑25: Output Properties for Video Live Portrait
Property Name |
Value |
|---|---|
GeneratedImage |
Chunky/packed 8-bit BGR or BGRA CPU/CUDA buffer. |
BoundingBoxes |
Optional: To be allocated by the user. |
FaceBoxStatus |
Output for detecting current status of face position within face box. 0: Face is inside the tracked bounding box. 1: Face is close to the border of the tracked bounding box. 2: Face is outside the tracked bounding box. |
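The three `FaceBoxStatus` codes listed in Table 3‑25 map naturally onto an enumeration in application code. The sketch below shows one way to do that. The enum and helper names are hypothetical and not part of the SDK; only the numeric values and their meanings come from the table.

```python
from enum import IntEnum


class FaceBoxStatus(IntEnum):
    """Status codes from Table 3-25 (member names are illustrative)."""
    INSIDE = 0        # face is inside the tracked bounding box
    NEAR_BORDER = 1   # face is close to the border of the box
    OUTSIDE = 2       # face is outside the tracked bounding box


def describe_face_box_status(value: int) -> str:
    """Translate a raw status code into a readable message."""
    status = FaceBoxStatus(value)  # raises ValueError for other codes
    messages = {
        FaceBoxStatus.INSIDE: "Face is inside the tracked bounding box.",
        FaceBoxStatus.NEAR_BORDER: "Face is close to the border of the tracked bounding box.",
        FaceBoxStatus.OUTSIDE: "Face is outside the tracked bounding box.",
    }
    return messages[status]
```

An application might, for instance, prompt the user to move back towards the center of the frame when the status is `NEAR_BORDER`.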
Frame Selection Property Values
The following tables list the values for the configuration, input, and output properties for Frame Selection.
Table 3‑26: Configuration Properties for Frame Selection
Property Name |
Value |
|---|---|
FeatureDescription |
String that describes the feature. This property is read-only. |
ModelDir |
String that contains the path to the face model and the TensorRT package files. Set by the user. |
CUDAStream |
Optional: The CUDA stream. Set by the user. |
Temporal |
Optional: Bitfield to control temporal filtering.
Default: 0x037 (all on except 0x100). Set by the user. |
Mode |
Optional: Frame Selection mode.
Default: 0. Set by the user. |
ActiveDuration |
Optional: Specifies how long (in frames), beginning with the first frame, frame selection can report frame status (good or bad) before reporting expired status
( If no good frame is detected in the first n frames specified by
Default: 0 (runs forever). Set by the user. |
GoodFrameMinInterval |
Optional: The minimum number of frames that must separate two reported good frames. If two good frames are closer together than this value, the later one is not reported. Default: 0 (no gap frames needed between good frames). Set by the user. |
Strategy |
Optional: Flag to control frame selection strategy.
Default: 1. Set by the user. |
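The interaction between `GoodFrameMinInterval` and `ActiveDuration` can be illustrated with a small filter over the indices of frames that were judged good. This is a sketch of the described behavior, not SDK code: `report_good_frames` is a hypothetical function, frame indices are assumed to start at 0, the gap is measured from the last reported good frame, and a frame at or beyond `ActiveDuration` is treated as expired.

```python
from typing import Iterable, List


def report_good_frames(good_frame_indices: Iterable[int],
                       min_interval: int = 0,
                       active_duration: int = 0) -> List[int]:
    """Return the good frames that would still be reported."""
    reported: List[int] = []
    for index in sorted(good_frame_indices):
        # An active duration of 0 means frame selection runs forever.
        if active_duration > 0 and index >= active_duration:
            break
        # Count the frames strictly between this frame and the last
        # reported one; skip the frame if the gap is too small.
        if reported and index - reported[-1] - 1 < min_interval:
            continue
        reported.append(index)
    return reported
```

With `min_interval=5`, good frames at indices 0, 1, 2, 10, and 12 would reduce to 0 and 10 under these assumptions.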
Table 3‑27: Input Properties for Frame Selection
Property Name |
Value |
|---|---|
Image |
Chunky/packed 8-bit BGR CUDA buffer. |
Table 3‑28: Output Properties for Frame Selection
Property Name |
Value |
|---|---|
FrameSelectorStatus |
Bitfield to indicate the current input image status. To learn more about the status code, refer to |
Speech Live Portrait Property Values
The following tables list the values for the configuration, input, and output properties for Speech Live Portrait.
Table 3‑29: Configuration Properties for Speech Live Portrait
Property Name |
Value |
|---|---|
FeatureDescription |
String that describes the feature. This property is read-only. |
ModelDir |
String that contains the path to the face model and the TensorRT package files. Set by the user. |
CUDAStream |
Optional: The CUDA stream. Set by the user. |
ModelSel |
Model optimized for performance or for quality.
Set by the user. |
Mode |
Speech Live Portrait mode.
If you see the head popping in and out
while using Set by the user. |
HeadPoseMode |
Select head animation.
Set by the user. |
SampleRate |
The sample rate for the audio input. The SDK currently supports 16-kHz audio only. |
NumChannels |
The number of channels for the audio input. The SDK currently supports mono channel audio only. |
SamplesPerFrame |
The number of samples per audio frame. This is the number of audio samples passed to |
NumInitialFrames |
The number of initial audio frames before the first image can be generated. You need to provide |
EnableLookAway |
Flag to enable gaze lookaway.
Set by the user. |
LookAwayOffsetMax |
The maximum integer value of gaze offset when lookaway is enabled. Default: 20
Unit: Degrees
Set by the user. |
LookAwayIntervalRange |
Range for picking the number of frames at which random lookaway occurs. Default: 90
Range: [1, 600]
Unit: Frames
Set by the user. |
LookAwayIntervalMin |
Minimum limit for the number of frames at which random lookaway occurs. Default: 240
Range: [1, 600]
Unit: Frames
Set by the user. |
BlinkFrequency |
The frequency of eye blinks per minute. Default: 15
Range: [0, 120]
Unit: Blinks per minute
A value of 0 disables eye blinks. Set by the user. |
BlinkDuration |
The duration of an eye blink. Default: 6
Range: [2, 150]
Unit: Frames
Set by the user. |
MouthExpressionMultiplier |
Specifies the degree of exaggeration for mouth movements. Higher values result in more exaggerated mouth motions. Default: 1.4f
Range: [1.0f, 1.6f]
Set by the user. |
MouthExpressionBase |
Defines the base openness of the mouth when idle (that is, zero audio input). Higher values lead to a more open mouth appearance during the idle state. Default: 0.3f
Range: [0.0f, 1.0f]
Set by the user. |
HeadPoseMultiplier |
A multiplier to dampen the range of head pose animation. This is applicable only for
Default: 1.0f
Range: [0.0f, 1.0f]
Set by the user. |
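The lookaway properties in Table 3‑29 can be read as parameters of a random schedule. The sketch below shows one plausible interpretation and should not be taken as the SDK's implementation: it assumes the number of frames until the next lookaway is `LookAwayIntervalMin` plus a uniformly chosen value between 0 and `LookAwayIntervalRange`, and that the gaze offset is an integer drawn symmetrically between −`LookAwayOffsetMax` and +`LookAwayOffsetMax`, following the description of the same-named property in the Eye Contact table. The function name `next_lookaway` is hypothetical.

```python
import random
from typing import Tuple


def next_lookaway(rng: random.Random,
                  interval_min: int = 240,
                  interval_range: int = 90,
                  offset_max: int = 20) -> Tuple[int, int]:
    """Return (frames until the next lookaway, gaze offset in degrees)."""
    for name, value in (("LookAwayIntervalMin", interval_min),
                        ("LookAwayIntervalRange", interval_range)):
        if not 1 <= value <= 600:
            raise ValueError(f"{name} must be in [1, 600]")
    if offset_max < 0:
        raise ValueError("LookAwayOffsetMax must not be negative")
    # Assumed schedule: minimum interval plus a uniform share of the range.
    frames_until = interval_min + rng.randint(0, interval_range)
    # Assumed symmetric integer offset around the lookaway angle.
    offset_deg = rng.randint(-offset_max, offset_max)
    return frames_until, offset_deg
```

With the documented defaults, every sampled interval falls between 240 and 330 frames under this interpretation.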
Table 3‑30: Input Properties for Speech Live Portrait
Property Name |
Value |
|---|---|
SourceImage |
Chunky/packed 8-bit BGR or BGRA CUDA buffer. Requirements:
If the face is close to the border of the image, in |
AudioFrameBuffer |
Raw audio frame buffer in CPU memory, with sample values ranging from −1.0 to 1.0, inclusive. Audio requirements:
|
HeadPoseRotation |
The input format is [qx, qy, qz, qw]. If the input quaternion is out of range, the rotation is clamped to ±20 degrees in Euler angles. |
HeadPoseTranslation |
The input format is [tx, ty, sz]. The range is [±0.03, ±0.02, 0.97–1.03]. An out-of-range value is clamped and logged as a warning. |
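Because out-of-range `HeadPoseTranslation` values are clamped by the SDK and logged as warnings, an application may prefer to clamp them itself before passing them in. The sketch below applies the documented ranges for [tx, ty, sz]. The constant and function names are hypothetical and not part of the SDK.

```python
from typing import List, Sequence, Tuple

# Documented ranges for tx, ty, and sz, in that order.
TRANSLATION_LIMITS = ((-0.03, 0.03), (-0.02, 0.02), (0.97, 1.03))


def clamp_head_pose_translation(values: Sequence[float]) -> Tuple[List[float], bool]:
    """Clamp [tx, ty, sz]; the flag reports whether any value changed."""
    if len(values) != 3:
        raise ValueError("expected three values: tx, ty, sz")
    clamped = []
    changed = False
    for value, (low, high) in zip(values, TRANSLATION_LIMITS):
        limited = min(max(value, low), high)
        changed = changed or limited != value
        clamped.append(limited)
    return clamped, changed
```

The returned flag lets the caller log its own message when an input had to be adjusted.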
Table 3‑31: Output Properties for Speech Live Portrait
Property Name |
Value |
|---|---|
GeneratedImage |
Chunky/packed 8-bit BGR or BGRA CPU/CUDA buffer. If the face is close to the border of the image, in |
BoundingBoxes |
Optional: To be allocated by the user. |
LipSync Property Values
The following tables list the values for the configuration, input, and output properties for LipSync.
Table 3‑32: Configuration Properties for LipSync
Property Name |
Value |
|---|---|
FeatureDescription |
String that describes the feature. This property is read-only. |
ModelDir |
String that contains the path to the face model and the TensorRT package files. Set by the user. |
CUDAStream |
Optional: The CUDA stream. Set by the user. |
SampleRate |
The sample rate for the audio input. The SDK currently supports 16-kHz audio only. |
NumChannels |
The number of channels for the audio input. The SDK currently supports mono channel audio only. |
NumInitialFrames |
The number of initial audio frames before the first image can be generated. You need to provide
The default value is 14. This property is read-only. |
Table 3‑33: Input Properties for LipSync
Property Name |
Value |
|---|---|
Image |
Chunky/packed 8-bit BGR or BGRA CUDA buffer. Requirements:
|
AudioFrameBuffer |
Raw audio frame buffer in CPU memory, with sample values ranging from −1.0 to 1.0, inclusive. When the LipSync feature is run, it assumes that the contents of the audio frame buffer are synchronized with the current input video frame. The length of the audio frame should approximately match the duration of the video frame. The caller can vary the length of each audio frame to maintain synchronization. Audio requirements:
|
SpeakerData |
An This input is an alternative to using
|
CameraIntrinsicParams |
Optional: Camera intrinsic parameters. A three-element float array with elements corresponding to focal length, cx, and cy, respectively, of an ideal perspective camera. Any barrel or fisheye distortion should be removed or considered negligible. Default: {f = input_height, cx = input_width/2, cy = input_height/2}. |
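At 16 kHz, one video frame rarely corresponds to a whole number of audio samples, which is why the `AudioFrameBuffer` description allows the audio frame length to vary. The sketch below shows one way a caller might choose per-frame lengths so that the audio stays aligned with the video over time. It is an illustration only: `audio_frame_lengths` is a hypothetical helper, and the rounding strategy is an assumption rather than an SDK requirement.

```python
from typing import List

SAMPLE_RATE_HZ = 16000  # the only sample rate the table lists as supported


def audio_frame_lengths(num_video_frames: int, fps: float) -> List[int]:
    """Return the number of audio samples to pass with each video frame."""
    if fps <= 0:
        raise ValueError("fps must be positive")
    lengths = []
    consumed = 0
    for frame in range(1, num_video_frames + 1):
        # Samples that should have been consumed by the end of this frame.
        target = round(frame * SAMPLE_RATE_HZ / fps)
        lengths.append(target - consumed)
        consumed = target
    return lengths
```

For 30 fps video this alternates between 533 and 534 samples per frame, so the total after 30 frames is exactly 16000 samples and no drift accumulates.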
Table 3‑34: Output Properties for LipSync
Property Name |
Value |
|---|---|
Image |
Chunky/packed 8-bit BGR or BGRA CPU/CUDA buffer. |
Ready |
Flag that is set to a non-zero value when the first output video frame is generated. |
Activation |
Floating-point value in the range 0–1 that indicates the activation level of LipSync in the output. When the activation is 0, it means the original face was copied directly to the output frame without modification. When the activation is 1, the original face was completely replaced by the animated face in the output frame. |