Working with Features#

Creating an Instance of a Feature Type#

A feature type is a predefined structure that provides access to an SDK feature. Each feature requires an instance of a feature type. Creating an instance gives you access to the configuration parameters that are used when the instance is loaded, and to the input and output parameters that are provided at runtime when the instance is run.

  1. Allocate memory for an NvAR_FeatureHandle structure.

    NvAR_FeatureHandle faceDetectHandle{};

  2. Call the NvAR_Create function.

    In the call to the function, pass the following information:

    • A value of the NvAR_FeatureID enumeration to identify the feature type.

    • A pointer to the variable that you declared to allocate memory for an NvAR_FeatureHandle structure.

  3. To create an instance of the face detection feature type, first include the feature header file at the top of your source file:

    #include "nvARFaceBoxDetection.h"

    To create the feature instance, use the following call:

    NvAR_Create(NvAR_Feature_FaceBoxDetection, &faceDetectHandle);

    This function creates a handle to the feature instance, which is required in function calls to get and set the properties of the instance and to load, run, or destroy the instance.

    Each NvAR_Feature_<feature-name> identifier is defined under the respective feature header; for example, nvARFaceBoxDetection.h. Ensure that the requested feature is installed so that the SDK has access to both feature libraries and required model files.

    During NvAR_Create, the SDK loads the feature library from the location <ar-sdk>/features/<feature>/<bin/lib>. Without a feature installation, the call to this API function returns an error code.

    For instructions on how to install features, refer to the installation guide.
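The steps above can be combined into a short sketch. The error-checking pattern follows the SDK's NvCV_Status conventions; treat the exact error-handling details as an assumption rather than required usage.

```cpp
// Sketch: create a face box detection feature instance with error checking.
// Assumes the AR SDK headers and libraries are installed and on the
// include/link path.
#include "nvARFaceBoxDetection.h"

#include <cstdio>

int main() {
  NvAR_FeatureHandle faceDetectHandle{};
  NvCV_Status err = NvAR_Create(NvAR_Feature_FaceBoxDetection, &faceDetectHandle);
  if (err != NVCV_SUCCESS) {
    // Typical causes: the feature is not installed, or the feature library or
    // model files cannot be found under <ar-sdk>/features/.
    std::printf("NvAR_Create failed with status %d\n", static_cast<int>(err));
    return 1;
  }
  // ... set properties, load, and run the instance ...
  NvAR_Destroy(faceDetectHandle);
  return 0;
}
```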

Getting and Setting Properties for a Feature Type#

To prepare to load and run an instance of a feature type, you need to set the properties that the instance requires.

Here are some of the properties:

  • The configuration properties that are required to load the feature type.

  • Input and output properties that are provided when instances of the feature type are run.

For a complete list of properties, refer to Key Values in the Properties of a Feature Type.

To set properties, the AR SDK provides type-safe set accessor functions. If you need the value of a property that has been set by a set accessor function, use the corresponding get accessor function. For a complete list of set and get functions, refer to Summary of AR SDK Accessor Functions.

Setting Up the CUDA Stream#

Some SDK features require a CUDA stream in which to run. For more information, refer to the NVIDIA CUDA Toolkit Documentation.

  1. Initialize a CUDA stream by calling one of the following functions:

    • The CUDA Runtime API function cudaStreamCreate().

    • NvAR_CudaStreamCreate.

      You can use the second function to avoid linking with the NVIDIA CUDA Toolkit libraries.

  2. Call the NvAR_SetCudaStream function and provide the following information as parameters:

    • The feature handle to the feature instance.

    • The key value NvAR_Parameter_Config(CUDAStream).

    • The CUDA stream that you created in the previous step.

The following example sets up a CUDA stream that was created by calling the NvAR_CudaStreamCreate function:

CUstream stream;
NvCV_Status nvErr;
nvErr = NvAR_CudaStreamCreate(&stream);
nvErr = NvAR_SetCudaStream(featureHandle, NvAR_Parameter_Config(CUDAStream), stream);
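A stream created with NvAR_CudaStreamCreate should be released with the SDK's matching destroy call once the features that use it have been destroyed. The sketch below assumes NvAR_CudaStreamDestroy as the counterpart function; if you created the stream with cudaStreamCreate() instead, use cudaStreamDestroy().

```cpp
// Release a stream created with NvAR_CudaStreamCreate after every feature
// instance that uses it has been destroyed.
NvAR_CudaStreamDestroy(stream);
```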

Summary of AR SDK Accessor Functions#

Table 1‑1: Summary of AR SDK Accessor Functions

| Property Type | Data Type | Set and Get Accessor Functions |
| --- | --- | --- |
| 32-bit unsigned integer | unsigned int | NvAR_SetU32(), NvAR_GetU32() |
| 32-bit signed integer | int | NvAR_SetS32(), NvAR_GetS32() |
| Single-precision (32-bit) floating-point number | float | NvAR_SetF32(), NvAR_GetF32() |
| Double-precision (64-bit) floating-point number | double | NvAR_SetF64(), NvAR_GetF64() |
| 64-bit unsigned integer | unsigned long long | NvAR_SetU64(), NvAR_GetU64() |
| Floating-point array | float* | NvAR_SetFloatArray(), NvAR_GetFloatArray() |
| 32-bit unsigned integer array | unsigned int* | NvAR_SetU32Array(), NvAR_GetU32Array() |
| Object | void* | NvAR_SetObject(), NvAR_GetObject() |
| Character string | const char* | NvAR_SetString(), NvAR_GetString() |
| CUDA stream | CUstream | NvAR_SetCudaStream(), NvAR_GetCudaStream() |

Key Values in the Properties of a Feature Type#

The key values in the properties of a feature type identify the properties that can be used with each feature type. Each key has a string equivalent and is defined by a macro that indicates the category of the property and takes a name as an input to the macro.

The following macros indicate the category of a property:

  • NvAR_Parameter_Config indicates a configuration property.

    For more information, refer to Configuration Properties.

  • NvAR_Parameter_Input indicates an input property.

    For more information, refer to Input Properties.

  • NvAR_Parameter_Output indicates an output property.

    For more information, refer to Output Properties.

The keywords appear in various macros, depending on whether a property is an input, an output, or a configuration property.

The property type denotes the accessor functions to set and get the property, as listed in the Summary of AR SDK Accessor Functions table.

Configuration Properties#

Here are the configuration properties in the AR SDK:

NvAR_Parameter_Config(FeatureDescription)

A description of the feature type.

String equivalent: NvAR_Parameter_Config_FeatureDescription
Property type: character string (const char*)
NvAR_Parameter_Config(CUDAStream)

The CUDA stream in which to run the feature.

String equivalent: NvAR_Parameter_Config_CUDAStream
Property type: CUDA stream (CUstream)
NvAR_Parameter_Config(ModelDir)

The path to the directory that contains both the TensorRT model files to be used to run inference for face detection or landmark detection and the .nvf file that contains the 3D Face model, excluding the model file name. For details about the format of the .nvf file, refer to Appendix A: NVIDIA 3DMM File Format.

String equivalent: NvAR_Parameter_Config_ModelDir
Property type: character string (const char*)
NvAR_Parameter_Config(BatchSize)

The number of inferences to be run at one time on the GPU.

String equivalent: NvAR_Parameter_Config_BatchSize
Property type: unsigned integer
NvAR_Parameter_Config(Landmarks_Size)

The length of the output buffer that contains the x and y coordinates in pixels of the detected landmarks. This property applies only to the landmark detection feature.

String equivalent: NvAR_Parameter_Config_Landmarks_Size
Property type: unsigned integer
NvAR_Parameter_Config(LandmarksConfidence_Size)

The length of the output buffer that contains the confidence values of the detected landmarks. This property applies only to the landmark detection feature.

String equivalent: NvAR_Parameter_Config_LandmarksConfidence_Size
Property type: unsigned integer
NvAR_Parameter_Config(Temporal)

Flag to enable optimization for temporal input frames. Enable the flag when the input is a video.

String equivalent: NvAR_Parameter_Config_Temporal
Property type: unsigned integer
NvAR_Parameter_Config(ShapeEigenValueCount)

The number of eigenvalues used to describe shape. The supplied face_model2.nvf contains 100 shape (also known as identity) eigenvalues, but query ShapeEigenValueCount when you allocate an array to receive the eigenvalues.

String equivalent: NvAR_Parameter_Config_ShapeEigenValueCount
Property type: unsigned integer
NvAR_Parameter_Config(ExpressionCount)

The number of coefficients used to represent expressions. The supplied face_model2.nvf contains 53 expression blendshape coefficients, but query ExpressionCount when you allocate an array to receive the coefficients.

String equivalent: NvAR_Parameter_Config_ExpressionCount
Property type: unsigned integer
NvAR_Parameter_Config(UseCudaGraph)

Flag to enable CUDA Graph optimization. The CUDA graph reduces the overhead of GPU operation submission for 3D body tracking.

String equivalent: NvAR_Parameter_Config_UseCudaGraph
Property type: bool
NvAR_Parameter_Config(GazeRedirect)

Flag to enable the redirection of gaze, in addition to gaze estimation, in the eye contact feature. By default, this is set to true.

String equivalent: NvAR_Parameter_Config_GazeRedirect
Property type: bool
NvAR_Parameter_Config(Mode)

Mode to select High Performance or High Quality for 3D Body Pose or Facial Landmark Detection.

String equivalent: NvAR_Parameter_Config_Mode
Property type: unsigned integer
NvAR_Parameter_Config(ReferencePose)

CPU buffer of type NvAR_Point3f to hold the Reference Pose for Joint Rotations for 3D Body Pose.

String equivalent: NvAR_Parameter_Config_ReferencePose
Property type: object (void*)
NvAR_Parameter_Config(FullBodyOnly)

Flag to select the pose estimation mode in 3D body pose and body detection: full-body only, or full- and upper-body pose estimation.

  • 0: Full- and upper-body pose estimation. Supports only High Quality mode in 3D body pose.

  • 1: Full-body pose estimation. Supports both High Quality and High Performance modes in 3D body pose.

String equivalent: NvAR_Parameter_Config_FullBodyOnly
Property type: unsigned int
NvAR_Parameter_Config(PostprocessJointAngle)

Flag to enable or disable the postprocessing steps for the joint angles of joints that are predicted with low confidence in 3D body pose. Use only when FullBodyOnly is set to 0. We recommend setting this to true when the input is an upper-body image or video.

String equivalent: NvAR_Parameter_Config_PostprocessJointAngle
Property type: bool
NvAR_Parameter_Config(TargetSeatedPoseForInterpolation)

CPU buffer of type NvAR_Quaternion to hold the target seated pose to be used for post-processing joint rotations for 3D Body Pose. For the joints that are predicted with low confidence, the output pose is interpolated to the corresponding pose specified in this target pose. To be used only when FullBodyOnly is set to 0.

String equivalent: NvAR_Parameter_Config_TargetSeatedPoseForInterpolation
Property type: object (void*)
NvAR_Parameter_Config(TargetStandPoseForInterpolation)

CPU buffer of type NvAR_Quaternion to hold the target standing pose to be used for post-processing joint rotations for 3D Body Pose. For the joints that are predicted with low confidence, the output pose is interpolated to the corresponding pose specified in this target pose. To be used only when FullBodyOnly is set to 0.

String equivalent: NvAR_Parameter_Config_TargetStandPoseForInterpolation
Property type: object (void*)
NvAR_Parameter_Config(TrackPeople)

Flag to select Multi-Person Tracking for 3D Body Pose Tracking.

String equivalent: NvAR_Parameter_Config_TrackPeople
Property type: unsigned integer
NvAR_Parameter_Config(ShadowTrackingAge)

The age after which the multi-person tracker no longer tracks the object in shadow mode. This property is measured in the number of frames.

String equivalent: NvAR_Parameter_Config_ShadowTrackingAge
Property type: unsigned integer
NvAR_Parameter_Config(ProbationAge)

The age after which the multi-person tracker marks the object valid and assigns an ID for tracking. This property is measured in the number of frames.

String equivalent: NvAR_Parameter_Config_ProbationAge
Property type: unsigned integer
NvAR_Parameter_Config(NetworkOutputImgWidth)

Width of the output image generated from the network (512 or 1024).

String equivalent: NvAR_Parameter_Config_NetworkOutputImgWidth
Property type: unsigned integer
NvAR_Parameter_Config(NetworkOutputImgHeight)

Height of the output image generated from the network (512 or 1024).

String equivalent: NvAR_Parameter_Config_NetworkOutputImgHeight
Property type: unsigned integer
NvAR_Parameter_Config(MaxTargetsTracked)

The maximum number of targets to be tracked by the multi-person tracker. After this limit is met, any new targets are discarded.

String equivalent: NvAR_Parameter_Config_MaxTargetsTracked
Property type: unsigned integer
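As an illustration of how these configuration keys combine, the following sketch configures a landmark detection instance before it is loaded. The handle, model directory, and stream variables are assumed to have been created in earlier steps; the Temporal value of 1 is an illustrative choice for video input.

```cpp
// Sketch: typical configuration calls before NvAR_Load, using the accessor
// that matches each property type documented above.
NvAR_SetString(landmarkDetectHandle, NvAR_Parameter_Config(ModelDir), modelDir);
NvAR_SetCudaStream(landmarkDetectHandle, NvAR_Parameter_Config(CUDAStream), stream);
NvAR_SetU32(landmarkDetectHandle, NvAR_Parameter_Config(Temporal), 1);  // video input
```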

Input Properties#

Here are the input properties in the AR SDK:

NvAR_Parameter_Input(Image)

GPU input image buffer of type NvCVImage.

String equivalent: NvAR_Parameter_Input_Image
Property type: object (void*)
NvAR_Parameter_Input(Width)

The width of the input image buffer in pixels.

String equivalent: NvAR_Parameter_Input_Width
Property type: integer
NvAR_Parameter_Input(Height)

The height of the input image buffer in pixels.

String equivalent: NvAR_Parameter_Input_Height
Property type: integer
NvAR_Parameter_Input(Landmarks)

CPU input array of type NvAR_Point2f that contains the facial landmark points.

String equivalent: NvAR_Parameter_Input_Landmarks
Property type: object (void*)
NvAR_Parameter_Input(BoundingBoxes)

Bounding boxes that determine the region of interest (ROI) of an input image that contains a face of type NvAR_BBoxes.

String equivalent: NvAR_Parameter_Input_BoundingBoxes
Property type: object (void*)
NvAR_Parameter_Input(FocalLength)

The focal length of the camera used for 3D Body Pose.

String equivalent: NvAR_Parameter_Input_FocalLength
Property type: float
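The input properties above are bound with the accessor that matches each property type. In the sketch below, `bodyPoseHandle` and `inputImageBuffer` are assumed to exist from earlier steps, and the focal length value is illustrative; FocalLength applies only to 3D Body Pose.

```cpp
// Sketch: bind a GPU input image and a camera focal length to a 3D Body Pose
// instance.
NvAR_SetObject(bodyPoseHandle, NvAR_Parameter_Input(Image),
               &inputImageBuffer, sizeof(NvCVImage));
NvAR_SetF32(bodyPoseHandle, NvAR_Parameter_Input(FocalLength), 800.0f);
```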

Output Properties#

Here are the output properties in the AR SDK:

NvAR_Parameter_Output(BoundingBoxes)

CPU output bounding boxes of type NvAR_BBoxes.

String equivalent: NvAR_Parameter_Output_BoundingBoxes
Property type: object (void*)
NvAR_Parameter_Output(TrackingBoundingBoxes)

CPU output tracking bounding boxes of type NvAR_TrackingBBoxes.

String equivalent: NvAR_Parameter_Output_TrackingBBoxes
Property type: object (void*)
NvAR_Parameter_Output(BoundingBoxesConfidence)

Float array of confidence values for each returned bounding box.

String equivalent: NvAR_Parameter_Output_BoundingBoxesConfidence
Property type: floating point array
NvAR_Parameter_Output(Landmarks)

CPU output buffer of type NvAR_Point2f to hold the output detected landmark key points. Refer to Facial point annotations for more information. The order of the points in the CPU buffer follows the order in MultiPIE 68-point markups, and the 126 points cover more points along the cheeks, the eyes, and the laugh lines.

String equivalent: NvAR_Parameter_Output_Landmarks
Property type: object (void*)
NvAR_Parameter_Output(LandmarksConfidence)

Float array of confidence values for each detected landmark point.

String equivalent: NvAR_Parameter_Output_LandmarksConfidence
Property type: floating point array
NvAR_Parameter_Output(Pose)

CPU array of type NvAR_Quaternion to hold the output-detected pose as an XYZW quaternion.

String equivalent: NvAR_Parameter_Output_Pose
Property type: object (void*)
NvAR_Parameter_Output(FaceMesh)

CPU 3D face Mesh of type NvAR_FaceMesh.

String equivalent: NvAR_Parameter_Output_FaceMesh
Property type: object (void*)
NvAR_Parameter_Output(RenderingParams)

CPU output structure of type NvAR_RenderingParams that contains the rendering parameters that might be used to render the 3D face mesh.

String equivalent: NvAR_Parameter_Output_RenderingParams
Property type: object (void*)
NvAR_Parameter_Output(ShapeEigenValues)

Float array of shape eigenvalues. Get NvAR_Parameter_Config(ShapeEigenValueCount) to determine the count of eigenvalues.

String equivalent: NvAR_Parameter_Output_ShapeEigenValues
Property type: floating point array
NvAR_Parameter_Output(ExpressionCoefficients)

Float array of expression coefficients. Get NvAR_Parameter_Config(ExpressionCount) to determine the count of coefficients.

String equivalent: NvAR_Parameter_Output_ExpressionCoefficients
Property type: floating point array
NvAR_Parameter_Output(KeyPoints)

CPU output buffer of type NvAR_Point2f to hold the output detected 2D Keypoints for Body Pose. Refer to Appendix B: 3D Body Pose Keypoint Format for information about the Keypoint names and the order of Keypoint output.

String equivalent: NvAR_Parameter_Output_KeyPoints
Property type: object (void*)
NvAR_Parameter_Output(KeyPoints3D)

CPU output buffer of type NvAR_Point3f to hold the output detected 3D Keypoints for Body Pose. Refer to Appendix B: 3D Body Pose Keypoint Format for information about the Keypoint names and the order of Keypoint output.

String equivalent: NvAR_Parameter_Output_KeyPoints3D
Property type: object (void*)
NvAR_Parameter_Output(JointAngles)

CPU output buffer of type NvAR_Point3f to hold the joint angles in axis-angle format.

String equivalent: NvAR_Parameter_Output_JointAngles
Property type: object (void*)
NvAR_Parameter_Output(KeyPointsConfidence)

Float array of confidence values for each detected keypoint.

String equivalent: NvAR_Parameter_Output_KeyPointsConfidence
Property type: floating point array
NvAR_Parameter_Output(OutputHeadTranslation)

Float array of three values that represent the x, y and z values of head translation with respect to the camera for Eye Contact.

String equivalent: NvAR_Parameter_Output_OutputHeadTranslation
Property type: floating point array
NvAR_Parameter_Output(OutputGazeVector)

Float array of two values that represent the yaw and pitch angles of the estimated gaze for Eye Contact.

String equivalent: NvAR_Parameter_Output_OutputGazeVector
Property type: floating point array
NvAR_Parameter_Output(HeadPose)

CPU array of type NvAR_Quaternion to hold the output-detected head pose as an XYZW quaternion in Eye Contact. This is an alternative to the head pose that was obtained from the facial landmarks feature. This head pose is obtained using the PnP algorithm over the landmarks.

String equivalent: NvAR_Parameter_Output_HeadPose
Property type: object (void*)
NvAR_Parameter_Output(GazeDirection)

Float array of two values that represent the yaw and pitch angles of the estimated gaze for Eye Contact.

String equivalent: NvAR_Parameter_Output_GazeDirection
Property type: floating point array
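Output buffers are allocated by the caller and then bound with NvAR_SetObject. The following sketch does this for the face detection bounding box output; the NvAR_BBoxes field names (boxes, num_boxes, max_boxes) follow the SDK structure definitions, and the capacity of 25 boxes is an illustrative assumption.

```cpp
// Sketch: allocate a CPU buffer for output bounding boxes and bind it to the
// face detection instance before NvAR_Run.
NvAR_Rect outputBoxData[25];
NvAR_BBoxes outputBoxes{};
outputBoxes.boxes = outputBoxData;  // caller-owned storage
outputBoxes.max_boxes = 25;        // capacity of the storage
outputBoxes.num_boxes = 0;         // filled in by the feature at run time
NvAR_SetObject(faceDetectHandle, NvAR_Parameter_Output(BoundingBoxes),
               &outputBoxes, sizeof(NvAR_BBoxes));
```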

Getting the Value of a Property of a Feature#

To get the value of a property of a feature, call the get accessor function that is appropriate for the data type of the property. In the call to the function, pass the following information:

  • The feature handle to the feature instance.

  • The key value that identifies the property that you are getting.

  • The location in memory where you want the value of the property to be written.

The following example determines the length of the NvAR_Point2f output buffer that was returned by the landmark detection feature:

unsigned int OUTPUT_SIZE_KPTS;

NvAR_GetU32(landmarkDetectHandle, NvAR_Parameter_Config(Landmarks_Size),
            &OUTPUT_SIZE_KPTS);

Setting a Property for a Feature#

To set a property for a feature:

  1. Allocate memory for all inputs and outputs that are required by the feature and any other properties that might be required.

  2. Call the set accessor function that is appropriate for the data type of the property.

    In the call to the function, pass the following information:

    • The feature handle to the feature instance.

    • The key value that identifies the property that you are setting.

    • A pointer to the value to which you want to set the property.

    The following example sets the file path to the file that contains the output 3D face model:

    const char *modelPath = "file/path/to/model";

    NvAR_SetString(landmarkDetectHandle, NvAR_Parameter_Config(ModelDir),
                   modelPath);
    

    The following example sets up the input image buffer in GPU memory, which is required by the face detection feature:

    Note

    The example sets up an 8-bit chunky/interleaved BGR array.

    NvCVImage inputImageBuffer;

    NvCVImage_Alloc(&inputImageBuffer, input_image_width, input_image_height,
                    NVCV_BGR, NVCV_U8, NVCV_CHUNKY, NVCV_GPU, 1);

    NvAR_SetObject(landmarkDetectHandle, NvAR_Parameter_Input(Image),
                   &inputImageBuffer, sizeof(NvCVImage));
    

For more information about the properties and the input and output requirements for each feature, refer to Properties for the AR SDK Features.

Note

The listed property name is the input to the macro that defines the key value for the property.

Loading a Feature Instance#

You can load the feature after setting the configuration properties that are required to load an instance of a feature type.

To load a feature instance, call the NvAR_Load function and specify the handle that was created for the feature instance when the instance was created. For more information, refer to Creating an Instance of a Feature Type.

The following example loads an instance of the face detection feature type:

NvAR_Load(faceDetectHandle);

Running a Feature Instance#

Before you can run the feature instance, you must load an instance of a feature type and set the user-allocated input and output memory buffers that are required when the feature instance is run.

To run a feature instance, call the NvAR_Run function and specify the handle that was created for the feature instance when the instance was created. For more information, refer to Creating an Instance of a Feature Type.

The following example shows how to run a face detection feature instance:

NvAR_Run(faceDetectHandle);
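In a typical application, NvAR_Run is called once per frame after each new frame has been copied into the GPU input buffer that was set on the instance. The sketch below assumes a hypothetical frame source (`haveNextFrame`, `cpuFrame`) and a previously bound input buffer and output structure; NvCVImage_Transfer is the NvCVImage API call for copying between CPU and GPU images.

```cpp
// Sketch of a per-frame loop: upload the frame, run the feature, then read
// the user-allocated output buffers.
while (haveNextFrame()) {
  // Copy the CPU frame into the GPU buffer bound to NvAR_Parameter_Input(Image).
  NvCVImage_Transfer(&cpuFrame, &inputImageBuffer, 1.0f, stream, nullptr);
  NvAR_Run(faceDetectHandle);
  // The output structures bound with NvAR_SetObject now hold this frame's
  // results, e.g. the detected bounding boxes.
}
```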

Resetting a Feature Instance#

Use the reset functionality when a feature instance needs to be returned to its original state, as if the next frame were the first frame run on the instance. Resetting does not change the stream or configuration values; it resets only the variables that have a temporal dependency on previous frames. Currently, the reset functionality applies only to the Eye Contact feature and throws an exception with the other features. The state handle parameter must be a nullptr for the feature to be reset.

The following example shows how to reset the Eye Contact feature instance:

NvAR_ResetState(eyeContactHandle, nullptr);

Destroying a Feature Instance#

When a feature instance is no longer required, destroy it to free the resources and memory that the instance allocated internally. Memory buffers that you provided as input or to hold the output of a feature are not freed when the instance is destroyed and must be deallocated separately.

To destroy a feature instance, call the NvAR_Destroy function and specify the handle that was created for the feature instance when the instance was created. For more information, refer to Creating an Instance of a Feature Type.

NvAR_Destroy(faceDetectHandle);
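A full teardown, under the assumptions of the earlier sketches (a stream from NvAR_CudaStreamCreate and an image allocated with NvCVImage_Alloc), might look like the following; the ordering shown (instance first, then stream and buffers) is a conventional choice.

```cpp
// Sketch of teardown: destroy the instance, then release the CUDA stream and
// the image buffer that the application allocated.
NvAR_Destroy(faceDetectHandle);
NvAR_CudaStreamDestroy(stream);        // only if created with NvAR_CudaStreamCreate
NvCVImage_Dealloc(&inputImageBuffer);  // frees the GPU image allocation
```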