Working with Features#
Creating an Instance of a Feature Type#
The feature type is a predefined structure that is used to access the SDK features. Each feature requires an instantiation of the feature type. Creating an instance of a feature type provides access to configuration parameters that are used when loading an instance of the feature type and to the input and output parameters that are provided at runtime when instances of the feature type are run.
Allocate memory for an NvAR_FeatureHandle structure.

NvAR_FeatureHandle faceDetectHandle{};

Call the NvAR_Create function. In the call to the function, pass the following information:

A value of the NvAR_FeatureID enumeration to identify the feature type.

A pointer to the variable that you declared to allocate memory for an NvAR_FeatureHandle structure.

To create an instance of the face detection feature type, first include the feature header file at the top of your source file:

#include "nvARFaceBoxDetection.h"

To create the feature instance, use the following call:

NvAR_Create(NvAR_Feature_FaceBoxDetection, &faceDetectHandle);

This function creates a handle to the feature instance, which is required in function calls to get and set the properties of the instance and to load, run, or destroy the instance.

Each NvAR_Feature_<feature-name> identifier is defined in the respective feature header; for example, nvARFaceBoxDetection.h. Ensure that the requested feature is installed so that the SDK has access to both the feature libraries and the required model files.

During NvAR_Create, the SDK loads the feature library from the location <ar-sdk>/features/<feature>/<bin/lib>. Without a feature installation, the call to this API function returns an error code. For instructions on how to install features, refer to the installation guide.
Getting and Setting Properties for a Feature Type#
To prepare to load and run an instance of a feature type, you need to set the properties that the instance requires.
Here are some of the properties:
The configuration properties that are required to load the feature type.
Input and output properties that are provided when instances of the feature type are run.
For a complete list of properties, refer to Key Values in the Properties of a Feature Type.
To set properties, the AR SDK provides type-safe set accessor functions. If you need the value of a property that has been set by a set accessor function, use the corresponding get accessor function. For a complete list of set and get functions, refer to Summary of AR SDK Accessor Functions.
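For example, a 32-bit unsigned integer property is set with NvAR_SetU32 and read back with NvAR_GetU32. The following sketch assumes a created feature instance, here called featureHandle, that supports the NvAR_Parameter_Config(BatchSize) property:

```cpp
// Set a property with the type-safe set accessor...
unsigned int batchSize = 1;
NvAR_SetU32(featureHandle, NvAR_Parameter_Config(BatchSize), batchSize);

// ...and read it back with the corresponding get accessor,
// which writes the value through the supplied pointer.
unsigned int confirmedBatchSize = 0;
NvAR_GetU32(featureHandle, NvAR_Parameter_Config(BatchSize), &confirmedBatchSize);
```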
Setting Up the CUDA Stream#
Some SDK features require a CUDA stream in which to run. For more information, refer to the NVIDIA CUDA Toolkit Documentation.
Initialize a CUDA stream by calling one of the following functions:
The CUDA Runtime API function cudaStreamCreate().

The NvAR_CudaStreamCreate() function. You can use this function to avoid linking with the NVIDIA CUDA Toolkit libraries.
Call the NvAR_SetCudaStream function and provide the following information as parameters:
The created feature handle.
For more information, refer to Creating an Instance of a Feature Type.
The key value NvAR_Parameter_Config(CUDAStream).
For more information, refer to Key Values in the Properties of a Feature Type.
The CUDA stream that you created in the previous step.
The following example sets up a CUDA stream that was created by calling the NvAR_CudaStreamCreate function:
CUstream stream;
nvErr = NvAR_CudaStreamCreate(&stream);
nvErr = NvAR_SetCudaStream(featureHandle, NvAR_Parameter_Config(CUDAStream), stream);
Summary of AR SDK Accessor Functions#
Table 1‑1: Summary of AR SDK Accessor Functions
| Property Type | Data Type | Set and Get Accessor Functions |
|---|---|---|
| 32-bit unsigned integer | unsigned int | NvAR_SetU32 and NvAR_GetU32 |
| 32-bit signed integer | int | NvAR_SetS32 and NvAR_GetS32 |
| Single-precision (32-bit) floating-point number | float | NvAR_SetF32 and NvAR_GetF32 |
| Double-precision (64-bit) floating-point number | double | NvAR_SetF64 and NvAR_GetF64 |
| 64-bit unsigned integer | unsigned long long | NvAR_SetU64 and NvAR_GetU64 |
| Floating-point array | float* | NvAR_SetF32Array and NvAR_GetF32Array |
| 32-bit unsigned integer array | unsigned int* | NvAR_SetU32Array and NvAR_GetU32Array |
| Object | void* | NvAR_SetObject and NvAR_GetObject |
| Character string | const char* | NvAR_SetString and NvAR_GetString |
| CUDA stream | CUstream | NvAR_SetCudaStream and NvAR_GetCudaStream |
Key Values in the Properties of a Feature Type#
The key values in the properties of a feature type identify the properties that can be used with each feature type. Each key has a string equivalent and is defined by a macro that indicates the category of the property and takes a name as an input to the macro.
The following macros indicate the category of a property:
NvAR_Parameter_Config indicates a configuration property. For more information, refer to Configuration Properties.

NvAR_Parameter_Input indicates an input property. For more information, refer to Input Properties.

NvAR_Parameter_Output indicates an output property. For more information, refer to Output Properties.
The keywords appear in various macros, depending on whether a property is an input, an output, or a configuration property.
The property type denotes the accessor functions to set and get the property, as listed in the Summary of AR SDK Accessor Functions table.
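As an illustration of how the pieces fit together, a key can be addressed through the category macro or, assuming the macro expands to the listed string equivalent, through the equivalent string directly. A hedged sketch, assuming a created feature handle named featureHandle:

```cpp
// Query the feature description through the category macro,
// which identifies the configuration property.
const char *desc = nullptr;
NvAR_GetString(featureHandle, NvAR_Parameter_Config(FeatureDescription), &desc);

// Equivalent call that passes the documented string equivalent directly.
NvAR_GetString(featureHandle, "NvAR_Parameter_Config_FeatureDescription", &desc);
```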
Configuration Properties#
Here are the configuration properties in the AR SDK:
- NvAR_Parameter_Config(FeatureDescription)
  A description of the feature type.
  String equivalent: NvAR_Parameter_Config_FeatureDescription
  Property type: character string (const char*)

- NvAR_Parameter_Config(CUDAStream)
  The CUDA stream in which to run the feature.
  String equivalent: NvAR_Parameter_Config_CUDAStream
  Property type: CUDA stream (CUstream)

- NvAR_Parameter_Config(ModelDir)
  The path to the directory that contains both the TensorRT model files to be used to run inference for face detection or landmark detection and the .nvf file that contains the 3D face model, excluding the model file name. For details about the format of the .nvf file, refer to Appendix A: NVIDIA 3DMM File Format.
  String equivalent: NvAR_Parameter_Config_ModelDir
  Property type: character string (const char*)

- NvAR_Parameter_Config(BatchSize)
  The number of inferences to be run at one time on the GPU.
  String equivalent: NvAR_Parameter_Config_BatchSize
  Property type: unsigned integer

- NvAR_Parameter_Config(Landmarks_Size)
  The length of the output buffer that contains the x and y coordinates, in pixels, of the detected landmarks. This property applies only to the landmark detection feature.
  String equivalent: NvAR_Parameter_Config_Landmarks_Size
  Property type: unsigned integer

- NvAR_Parameter_Config(LandmarksConfidence_Size)
  The length of the output buffer that contains the confidence values of the detected landmarks. This property applies only to the landmark detection feature.
  String equivalent: NvAR_Parameter_Config_LandmarksConfidence_Size
  Property type: unsigned integer

- NvAR_Parameter_Config(Temporal)
  Flag to enable optimization for temporal input frames. Enable this flag when the input is a video.
  String equivalent: NvAR_Parameter_Config_Temporal
  Property type: unsigned integer

- NvAR_Parameter_Config(ShapeEigenValueCount)
  The number of eigenvalues used to describe shape. The supplied face_model2.nvf contains 100 shape (also known as identity) eigenvalues, but query ShapeEigenValueCount when you allocate an array to receive the eigenvalues.
  String equivalent: NvAR_Parameter_Config_ShapeEigenValueCount
  Property type: unsigned integer

- NvAR_Parameter_Config(ExpressionCount)
  The number of coefficients used to represent expression. The supplied face_model2.nvf contains 53 expression blendshape coefficients, but query ExpressionCount when you allocate an array to receive the coefficients.
  String equivalent: NvAR_Parameter_Config_ExpressionCount
  Property type: unsigned integer

- NvAR_Parameter_Config(UseCudaGraph)
  Flag to enable CUDA Graph optimization. The CUDA graph reduces the overhead of GPU operation submission for 3D body tracking.
  String equivalent: NvAR_Parameter_Config_UseCudaGraph
  Property type: bool

- NvAR_Parameter_Config(GazeRedirect)
  Flag to enable the redirection of gaze, in addition to gaze estimation, in the Eye Contact feature. By default, this flag is set to true.
  String equivalent: NvAR_Parameter_Config_GazeRedirect
  Property type: bool

- NvAR_Parameter_Config(Mode)
  Mode to select High Performance or High Quality for 3D Body Pose or facial landmark detection.
  String equivalent: NvAR_Parameter_Config_Mode
  Property type: unsigned integer

- NvAR_Parameter_Config(ReferencePose)
  CPU buffer of type NvAR_Point3f to hold the reference pose for joint rotations for 3D Body Pose.
  String equivalent: NvAR_Parameter_Config_ReferencePose
  Property type: object (void*)

- NvAR_Parameter_Config(FullBodyOnly)
  Flag to select the pose estimation mode in 3D body pose and body detection: full-body only, or full- and upper-body pose estimation.
  0: Full- and upper-body pose estimation. Supports only High Quality mode in 3D body pose.
  1: Full-body pose estimation. Supports both High Quality and High Performance modes in 3D body pose.
  String equivalent: NvAR_Parameter_Config_FullBodyOnly
  Property type: unsigned integer

- NvAR_Parameter_Config(PostprocessJointAngle)
  Flag to enable or disable the postprocessing steps for the joint angles that correspond to joints predicted with low confidence in 3D body pose. To be used only when FullBodyOnly is set to 0. We recommend that you set this flag to true when the input is an upper-body image or video.
  String equivalent: NvAR_Parameter_Config_PostprocessJointAngle
  Property type: bool

- NvAR_Parameter_Config(TargetSeatedPoseForInterpolation)
  CPU buffer of type NvAR_Quaternion to hold the target seated pose to be used for postprocessing joint rotations for 3D Body Pose. For joints that are predicted with low confidence, the output pose is interpolated to the corresponding pose specified in this target pose. To be used only when FullBodyOnly is set to 0.
  String equivalent: NvAR_Parameter_Config_TargetSeatedPoseForInterpolation
  Property type: object (void*)

- NvAR_Parameter_Config(TargetStandPoseForInterpolation)
  CPU buffer of type NvAR_Quaternion to hold the target standing pose to be used for postprocessing joint rotations for 3D Body Pose. For joints that are predicted with low confidence, the output pose is interpolated to the corresponding pose specified in this target pose. To be used only when FullBodyOnly is set to 0.
  String equivalent: NvAR_Parameter_Config_TargetStandPoseForInterpolation
  Property type: object (void*)

- NvAR_Parameter_Config(TrackPeople)
  Flag to select multi-person tracking for 3D Body Pose tracking.
  String equivalent: NvAR_Parameter_Config_TrackPeople
  Property type: unsigned integer

- NvAR_Parameter_Config(ShadowTrackingAge)
  The age after which the multi-person tracker no longer tracks the object in shadow mode, measured in number of frames.
  String equivalent: NvAR_Parameter_Config_ShadowTrackingAge
  Property type: unsigned integer

- NvAR_Parameter_Config(ProbationAge)
  The age after which the multi-person tracker marks the object valid and assigns an ID for tracking, measured in number of frames.
  String equivalent: NvAR_Parameter_Config_ProbationAge
  Property type: unsigned integer

- NvAR_Parameter_Config(NetworkOutputImgWidth)
  The width of the output image generated from the network (512 or 1024).
  String equivalent: NvAR_Parameter_Config_NetworkOutputImgWidth
  Property type: unsigned integer

- NvAR_Parameter_Config(NetworkOutputImgHeight)
  The height of the output image generated from the network (512 or 1024).
  String equivalent: NvAR_Parameter_Config_NetworkOutputImgHeight
  Property type: unsigned integer

- NvAR_Parameter_Config(MaxTargetsTracked)
  The maximum number of targets to be tracked by the multi-person tracker. After this limit is reached, any new targets are discarded.
  String equivalent: NvAR_Parameter_Config_MaxTargetsTracked
  Property type: unsigned integer
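Taken together, a typical configuration pass sets each required key before the feature is loaded. The following is a minimal sketch, assuming a landmark detection handle named landmarkDetectHandle, a previously created CUDA stream named stream, and an illustrative model path:

```cpp
// Configure a landmark detection instance before calling NvAR_Load.
const char *modelDir = "file/path/to/models";   // illustrative path, not a real location
NvAR_SetString(landmarkDetectHandle, NvAR_Parameter_Config(ModelDir), modelDir);
NvAR_SetCudaStream(landmarkDetectHandle, NvAR_Parameter_Config(CUDAStream), stream);
NvAR_SetU32(landmarkDetectHandle, NvAR_Parameter_Config(Temporal), 1);   // input is video
```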
Input Properties#
Here are the input properties in the AR SDK:
- NvAR_Parameter_Input(Image)
  GPU input image buffer of type NvCVImage.
  String equivalent: NvAR_Parameter_Input_Image
  Property type: object (void*)

- NvAR_Parameter_Input(Width)
  The width of the input image buffer in pixels.
  String equivalent: NvAR_Parameter_Input_Width
  Property type: integer

- NvAR_Parameter_Input(Height)
  The height of the input image buffer in pixels.
  String equivalent: NvAR_Parameter_Input_Height
  Property type: integer

- NvAR_Parameter_Input(Landmarks)
  CPU input array of type NvAR_Point2f that contains the facial landmark points.
  String equivalent: NvAR_Parameter_Input_Landmarks
  Property type: object (void*)

- NvAR_Parameter_Input(BoundingBoxes)
  Bounding boxes of type NvAR_BBoxes that determine the region of interest (ROI) of an input image that contains a face.
  String equivalent: NvAR_Parameter_Input_BoundingBoxes
  Property type: object (void*)

- NvAR_Parameter_Input(FocalLength)
  The focal length of the camera used for 3D Body Pose.
  String equivalent: NvAR_Parameter_Input_FocalLength
  Property type: float
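As an example of wiring input properties, the following sketch binds a GPU image buffer and a camera focal length to a hypothetical 3D Body Pose handle named bodyPoseHandle; the buffer dimensions and focal length value are illustrative:

```cpp
// Allocate a GPU input image and bind it, along with the focal length,
// as input properties of the feature instance.
NvCVImage inputImage;
NvCVImage_Alloc(&inputImage, 1280, 720, NVCV_BGR, NVCV_U8, NVCV_CHUNKY, NVCV_GPU, 1);
NvAR_SetObject(bodyPoseHandle, NvAR_Parameter_Input(Image), &inputImage, sizeof(NvCVImage));
NvAR_SetF32(bodyPoseHandle, NvAR_Parameter_Input(FocalLength), 800.0f);  // illustrative value
```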
Output Properties#
Here are the output properties in the AR SDK:
- NvAR_Parameter_Output(BoundingBoxes)
  CPU output bounding boxes of type NvAR_BBoxes.
  String equivalent: NvAR_Parameter_Output_BoundingBoxes
  Property type: object (void*)

- NvAR_Parameter_Output(TrackingBoundingBoxes)
  CPU output tracking bounding boxes of type NvAR_TrackingBBoxes.
  String equivalent: NvAR_Parameter_Output_TrackingBBoxes
  Property type: object (void*)

- NvAR_Parameter_Output(BoundingBoxesConfidence)
  Float array of confidence values for each returned bounding box.
  String equivalent: NvAR_Parameter_Output_BoundingBoxesConfidence
  Property type: floating-point array

- NvAR_Parameter_Output(Landmarks)
  CPU output buffer of type NvAR_Point2f to hold the output detected landmark key points. Refer to Facial point annotations for more information. The order of the points in the CPU buffer follows the order in MultiPIE 68-point markups, and the 126 points cover more points along the cheeks, the eyes, and the laugh lines.
  String equivalent: NvAR_Parameter_Output_Landmarks
  Property type: object (void*)

- NvAR_Parameter_Output(LandmarksConfidence)
  Float array of confidence values for each detected landmark point.
  String equivalent: NvAR_Parameter_Output_LandmarksConfidence
  Property type: floating-point array

- NvAR_Parameter_Output(Pose)
  CPU array of type NvAR_Quaternion to hold the output detected pose as an XYZW quaternion.
  String equivalent: NvAR_Parameter_Output_Pose
  Property type: object (void*)

- NvAR_Parameter_Output(FaceMesh)
  CPU 3D face mesh of type NvAR_FaceMesh.
  String equivalent: NvAR_Parameter_Output_FaceMesh
  Property type: object (void*)

- NvAR_Parameter_Output(RenderingParams)
  CPU output structure of type NvAR_RenderingParams that contains the rendering parameters that might be used to render the 3D face mesh.
  String equivalent: NvAR_Parameter_Output_RenderingParams
  Property type: object (void*)

- NvAR_Parameter_Output(ShapeEigenValues)
  Float array of shape eigenvalues. Get NvAR_Parameter_Config(ShapeEigenValueCount) to determine the number of eigenvalues.
  String equivalent: NvAR_Parameter_Output_ShapeEigenValues
  Property type: floating-point array

- NvAR_Parameter_Output(ExpressionCoefficients)
  Float array of expression coefficients. Get NvAR_Parameter_Config(ExpressionCount) to determine the number of coefficients.
  String equivalent: NvAR_Parameter_Output_ExpressionCoefficients
  Property type: floating-point array

- NvAR_Parameter_Output(KeyPoints)
  CPU output buffer of type NvAR_Point2f to hold the output detected 2D keypoints for Body Pose. Refer to Appendix B: 3D Body Pose Keypoint Format for information about the keypoint names and the order of the keypoint output.
  String equivalent: NvAR_Parameter_Output_KeyPoints
  Property type: object (void*)

- NvAR_Parameter_Output(KeyPoints3D)
  CPU output buffer of type NvAR_Point3f to hold the output detected 3D keypoints for Body Pose. Refer to Appendix B: 3D Body Pose Keypoint Format for information about the keypoint names and the order of the keypoint output.
  String equivalent: NvAR_Parameter_Output_KeyPoints3D
  Property type: object (void*)

- NvAR_Parameter_Output(JointAngles)
  CPU output buffer of type NvAR_Point3f to hold the joint angles in axis-angle format.
  String equivalent: NvAR_Parameter_Output_JointAngles
  Property type: object (void*)

- NvAR_Parameter_Output(KeyPointsConfidence)
  Float array of confidence values for each detected keypoint.
  String equivalent: NvAR_Parameter_Output_KeyPointsConfidence
  Property type: floating-point array

- NvAR_Parameter_Output(OutputHeadTranslation)
  Float array of three values that represent the x, y, and z values of head translation with respect to the camera for Eye Contact.
  String equivalent: NvAR_Parameter_Output_OutputHeadTranslation
  Property type: floating-point array

- NvAR_Parameter_Output(OutputGazeVector)
  Float array of two values that represent the yaw and pitch angles of the estimated gaze for Eye Contact.
  String equivalent: NvAR_Parameter_Output_OutputGazeVector
  Property type: floating-point array

- NvAR_Parameter_Output(HeadPose)
  CPU array of type NvAR_Quaternion to hold the output detected head pose as an XYZW quaternion in Eye Contact. This is an alternative to the head pose that is obtained from the facial landmarks feature. This head pose is obtained by using the PnP algorithm over the landmarks.
  String equivalent: NvAR_Parameter_Output_HeadPose
  Property type: object (void*)

- NvAR_Parameter_Output(GazeDirection)
  Float array of two values that represent the yaw and pitch angles of the estimated gaze for Eye Contact.
  String equivalent: NvAR_Parameter_Output_GazeDirection
  Property type: floating-point array
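Output buffers are allocated by the caller and bound with the matching accessor before the instance runs. The following sketch wires a face detection output, assuming the NvAR_BBoxes structure exposes boxes and max_boxes fields that wrap a caller-owned array, as in the SDK sample applications:

```cpp
// Caller-owned storage that receives the detected bounding boxes.
NvAR_Rect outputBoxes[25];
NvAR_BBoxes outputBBoxes{};
outputBBoxes.boxes = outputBoxes;
outputBBoxes.max_boxes = 25;
NvAR_SetObject(faceDetectHandle, NvAR_Parameter_Output(BoundingBoxes),
               &outputBBoxes, sizeof(NvAR_BBoxes));
```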
Getting the Value of a Property of a Feature#
To get the value of a property of a feature, call the get accessor function that is appropriate for the data type of the property. In the call to the function, pass the following information:
The feature handle to the feature instance.
The key value that identifies the property that you are getting.
The location in memory where you want the value of the property to be written.
The following example determines the length of the NvAR_Point2f output buffer
that was returned by the landmark detection feature:
unsigned int OUTPUT_SIZE_KPTS;
NvAR_GetU32(landmarkDetectHandle, NvAR_Parameter_Config(Landmarks_Size),
&OUTPUT_SIZE_KPTS);
Setting a Property for a Feature#
To set a property for a feature:
Allocate memory for all inputs and outputs that are required by the feature and any other properties that might be required.
Call the set accessor function that is appropriate for the data type of the property.
In the call to the function, pass the following information:
The feature handle to the feature instance.
The key value that identifies the property that you are setting.
A pointer to the value to which you want to set the property.
The following example sets the path to the directory that contains the model files:

const char *modelPath = "file/path/to/model";
NvAR_SetString(landmarkDetectHandle, NvAR_Parameter_Config(ModelDir), modelPath);
The following example sets up the input image buffer in GPU memory, which is required by the face detection feature:
Note
The example sets up an 8-bit chunky/interleaved BGR array.
NvCVImage inputImageBuffer;
NvCVImage_Alloc(&inputImageBuffer, input_image_width, input_image_height, NVCV_BGR, NVCV_U8, NVCV_CHUNKY, NVCV_GPU, 1);
NvAR_SetObject(landmarkDetectHandle, NvAR_Parameter_Input(Image), &inputImageBuffer, sizeof(NvCVImage));
For more information about the properties and the input and output requirements for each feature, refer to Properties for the AR SDK Features.
Note
The listed property name is the input to the macro that defines the key value for the property.
Loading a Feature Instance#
You can load the feature after setting the configuration properties that are required to load an instance of a feature type.
To load a feature instance, call the NvAR_Load function and specify the handle that was created for the feature instance when the instance was created. For more information, refer to Creating an Instance of a Feature Type.
The following example loads an instance of the face detection feature type:
NvAR_Load(faceDetectHandle);
Running a Feature Instance#
Before you can run the feature instance, you must load an instance of a feature type and set the user-allocated input and output memory buffers that are required when the feature instance is run.
To run a feature instance, call the NvAR_Run function and specify the handle that was created for the feature instance when the instance was created. For more information, refer to Creating an Instance of a Feature Type.
The following example shows how to run a face detection feature instance:
NvAR_Run(faceDetectHandle);
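In a video pipeline, the instance is loaded once and the run call repeats for every frame, reusing the input and output buffers that were bound earlier. A sketch of that loop, where getNextFrame is a hypothetical function that fills the bound GPU input image:

```cpp
// Run the loaded instance once per frame; the bound buffers are reused.
while (getNextFrame(&inputImageBuffer)) {   // hypothetical frame source
    NvAR_Run(faceDetectHandle);
    // Results are now in the caller-allocated output buffers, for example
    // the NvAR_BBoxes bound to NvAR_Parameter_Output(BoundingBoxes).
}
```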
Resetting a Feature Instance#
When a feature instance must be reset to its original state, as if the next frame were the first frame to be run on the instance, use the reset functionality. Resetting does not change the stream and configuration values; it resets only the variables that have a temporal dependency on previous frames. Currently, the reset functionality applies only to the Eye Contact feature and throws an exception with the other features. Note that the state handle parameter must be a nullptr for the feature to be reset.
The following example shows how to reset the Eye Contact feature instance:
NvAR_ResetState(eyeContactHandle, nullptr);
Destroying a Feature Instance#
When a feature instance is no longer required, you need to destroy it to free the resources and memory that the feature instance allocated internally. Memory buffers that you provided as input or to hold the output of a feature must be deallocated separately.
To destroy a feature instance, call the NvAR_Destroy function and specify the handle that was created for the feature instance when the instance was created. For more information, refer to Creating an Instance of a Feature Type.
NvAR_Destroy(faceDetectHandle);
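The calls described in this section form the complete lifecycle of a feature instance. A condensed sketch, with property setup summarized in a comment and error checking elided:

```cpp
// End-to-end lifecycle of a feature instance.
NvAR_FeatureHandle handle{};
NvAR_Create(NvAR_Feature_FaceBoxDetection, &handle);  // create the instance
// ... set configuration, input, and output properties here ...
NvAR_Load(handle);                                    // load with the configured properties
NvAR_Run(handle);                                     // run, typically once per frame
NvAR_Destroy(handle);                                 // free internally allocated resources
```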