3D Object Pose Estimation with DOPE

Deep Object Pose Estimation (DOPE) detects known objects and estimates their 3D poses from a single RGB image. A deep network predicts the image keypoints for the eight corners and the centroid of an object’s 3D bounding box, and Perspective-n-Point (PnP) postprocessing recovers the 3D pose from those keypoints. Because this approach differs from the existing Pose CNN model, it adds diversity to the 3D pose estimation toolset in the Isaac SDK.
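The relationship between the bounding-box keypoints and the pose can be illustrated with the forward model that PnP inverts: given a pose, project the nine 3D keypoints into the image. The following is a minimal sketch, not the SDK implementation. The function names, the corner ordering, the camera intrinsics, and the pose are all assumptions chosen for illustration.

```python
from itertools import product


def cuboid_keypoints(size_x, size_y, size_z):
    """Return the 8 corners and the centroid of a box centered at the origin."""
    hx, hy, hz = size_x / 2.0, size_y / 2.0, size_z / 2.0
    # Corner ordering here is arbitrary; a real decoder must match the
    # ordering the network was trained with.
    corners = [(sx * hx, sy * hy, sz * hz)
               for sx, sy, sz in product((-1.0, 1.0), repeat=3)]
    return corners + [(0.0, 0.0, 0.0)]


def project(points, rotation, translation, fx, fy, cx, cy):
    """Project object-frame 3D points to pixels with a pinhole camera model."""
    pixels = []
    for p in points:
        # Transform the point into the camera frame: X_cam = R * X_obj + t.
        xc, yc, zc = (
            sum(rotation[r][c] * p[c] for c in range(3)) + translation[r]
            for r in range(3)
        )
        pixels.append((fx * xc / zc + cx, fy * yc / zc + cy))
    return pixels


# Hypothetical values: an example box size, identity rotation, the object
# placed 0.5 units in front of the camera, and made-up intrinsics.
keypoints_3d = cuboid_keypoints(0.06766, 0.102, 0.0677)
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
keypoints_2d = project(keypoints_3d, identity, (0.0, 0.0, 0.5),
                       fx=600.0, fy=600.0, cx=320.0, cy=240.0)
print(len(keypoints_2d), keypoints_2d[-1])  # prints: 9 (320.0, 240.0)
```

PnP solves the inverse problem: given the 2D keypoints predicted by the network, the known 3D keypoints, and the camera intrinsics, it finds the rotation and translation that best reproduce the observed projections.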



The Isaac SDK supports DOPE inference with TensorRT, with postprocessing implemented in C++ by the DopeDecoder codelet. Each DOPE network model detects multiple instances of a single object class.
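In the published DOPE method, the network outputs one belief map per keypoint, and several instances of the same object show up as several peaks in each map. The sketch below illustrates only that peak-finding idea on a toy map. It is not the DopeDecoder implementation, and the function name and threshold are assumptions.

```python
def find_peaks(belief_map, threshold=0.5):
    """Return (row, col, value) for every local maximum above threshold."""
    rows, cols = len(belief_map), len(belief_map[0])
    peaks = []
    for r in range(rows):
        for c in range(cols):
            value = belief_map[r][c]
            if value < threshold:
                continue
            neighbors = [
                belief_map[rr][cc]
                for rr in range(max(r - 1, 0), min(r + 2, rows))
                for cc in range(max(c - 1, 0), min(c + 2, cols))
                if (rr, cc) != (r, c)
            ]
            # Keep the cell only if it is strictly greater than all neighbors.
            if all(value > n for n in neighbors):
                peaks.append((r, c, value))
    return peaks


# Toy belief map for one keypoint with two object instances in view.
belief = [
    [0.0, 0.1, 0.0, 0.0, 0.0],
    [0.1, 0.9, 0.1, 0.0, 0.0],
    [0.0, 0.1, 0.0, 0.2, 0.0],
    [0.0, 0.0, 0.2, 0.8, 0.2],
    [0.0, 0.0, 0.0, 0.2, 0.0],
]
print(find_peaks(belief))  # prints: [(1, 1, 0.9), (3, 3, 0.8)]
```

Each peak is a candidate keypoint for one instance. Grouping the peaks of all nine keypoints into per-object sets is a separate step that this sketch does not cover.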


The following command runs the DOPE inference app with the default sample model (which uses VGG19 pretrained on OpenImages v4) on a sample image to detect the YCB cracker box object (003_cracker):

bob@desktop:~/isaac/sdk$ bazel run packages/object_pose_estimation/apps/dope:dope_inference

Open Sight at localhost:3000 to view the results.


To run inference live with a RealSense camera, use the following command:

bob@desktop:~/isaac/sdk$ bazel run packages/object_pose_estimation/apps/dope:dope_inference -- --mode realsense

Inference for other YCB objects

The Deep_Object_Pose GitHub repository provides pre-trained PyTorch models for additional YCB objects (note that these models use VGG19 from the PyTorch model zoo, pretrained on ImageNet). To run the inference app with a different model, download its weights from the Deep_Object_Pose GitHub repository; for example, save the weights for the tomato soup can (YCB 005_tomato_soup_can) to /tmp/soup_60.pth. Then convert the PyTorch model to an ONNX model with:

bob@desktop:~/isaac/sdk$ bazel run packages/object_pose_estimation/apps/dope:dope_model -- --input /tmp/soup_60.pth

This generates the ONNX model at /tmp/soup_60.onnx. Run the inference app with this model:

bob@desktop:~/isaac/sdk$ bazel run packages/object_pose_estimation/apps/dope:dope_inference -- --mode realsense --model /tmp/soup_60.onnx --box 0.06766 0.102 0.0677 --label soup

Note that the 3D bounding box size of the new object must be provided with --box, because the DopeDecoder codelet needs it to perform PnP. Bounding box sizes for the YCB objects can be found in the Deep_Object_Pose config or the YCB paper.
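When copying box sizes from another source, check the units. The sketch below assumes the source lists dimensions in centimeters and that --box expects meters, which the sample values (roughly 6.8 cm by 10.2 cm by 6.8 cm for a soup can) suggest but which should be verified. The helper names are hypothetical, and the centimeter values are simply the sample --box values multiplied by 100, not values copied from the config.

```python
import shlex


def box_args_from_cm(dimensions_cm):
    """Convert box dimensions from centimeters to meters, as --box strings."""
    # Four significant digits is an arbitrary choice for readability.
    return [f"{d / 100.0:.4g}" for d in dimensions_cm]


def inference_command(model_path, box_m, label, mode="realsense"):
    """Assemble the dope_inference command line from the documented flags."""
    return shlex.join([
        "bazel", "run",
        "packages/object_pose_estimation/apps/dope:dope_inference", "--",
        "--mode", mode, "--model", model_path,
        "--box", *box_m, "--label", label,
    ])


# Illustrative centimeter values (assumption, see the note above).
box = box_args_from_cm([6.766, 10.2, 6.77])
print(box)
print(inference_command("/tmp/soup_60.onnx", box, "soup"))
```

The script only prints the command. Running it from the Isaac SDK root is left to the user.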


DOPE training is currently not part of the Isaac SDK. For training, refer to the PyTorch training script, scripts/train.py, in the Deep_Object_Pose GitHub repository.

For any pre-trained model hosted on the Deep_Object_Pose GitHub repository, or any model trained with its training script, use:

bob@desktop:~/isaac/sdk$ bazel run packages/object_pose_estimation/apps/dope:dope_model -- --input <path to torch model>

to convert the PyTorch model to an ONNX model for use with the Isaac SDK’s DOPE inference pipeline.
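When several models need converting, the commands can be generated in one pass. The following sketch only assembles the command strings and does not run them. It assumes the ONNX file is written next to the input with an .onnx suffix, as in the /tmp/soup_60.pth example. The helper name and the second file name are hypothetical.

```python
import shlex
from pathlib import PurePosixPath

CONVERT_TARGET = "packages/object_pose_estimation/apps/dope:dope_model"


def conversion_plan(weight_files):
    """Pair each .pth file with its conversion command and expected output."""
    plan = []
    for weights in map(PurePosixPath, weight_files):
        command = shlex.join(
            ["bazel", "run", CONVERT_TARGET, "--", "--input", str(weights)])
        # Assumption: the ONNX file lands next to the input, as in the
        # /tmp/soup_60.pth -> /tmp/soup_60.onnx example.
        plan.append((command, weights.with_suffix(".onnx")))
    return plan


# Hypothetical file names, used only for illustration.
for command, onnx_path in conversion_plan(
        ["/tmp/soup_60.pth", "/tmp/my_object.pth"]):
    print(command)
    print("  expected output:", onnx_path)
```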

To train on custom objects, refer to the synthetic data generation tools in Omniverse Isaac Sim. The offline training data follows the format of the FAT (Falling Things) dataset.