Stereo Visual Odometry

The Isaac SDK includes Elbrus Visual Odometry, a codelet and library that determine the 6 degrees of freedom (3 for orientation and 3 for location) by continuously analyzing the images obtained from a stereo camera video stream.

Tracking speed is effectively real-time: at least 30 fps for 640x480 video resolution. Accuracy on the KITTI benchmark is:

  • drift of ~1% in localization
  • error of 0.003 degrees/meter of motion

Elbrus provides robust tracking across a variety of environments and use cases:

  • indoor
  • outdoor
  • aerial
  • HMD
  • automotive
  • robotics


Elbrus employs the industry-standard keyframe-based SLAM architecture. The end-to-end tracking pipeline contains two major components: 2D and 3D.

The 2D component detects well-distributed 2D features (tracks) in the frames it marks as keyframes, then tracks those feature locations between keyframes using Lucas-Kanade (LK), a multi-pyramid algorithm. The found tracks are passed to the 3D component of the solver as 2D positions with a unique ID for every track. In stereo tracking, the 3D component feeds predictions back to the LK tracker based on the current results of 3D camera tracking and a "velocity constancy" camera motion model. This improves and accelerates LK tracking and helps to purge 2D tracking outliers.
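The detect-then-track step above can be sketched with a minimal, single-level Lucas-Kanade solve in NumPy. This is an illustration of the LK principle only; the actual Elbrus tracker is multi-pyramid and proprietary, and the function name here is invented for the example:

```python
import numpy as np

def lk_displacement(prev_img, curr_img, x, y, win=9):
    """Estimate the (dx, dy) motion of the feature at (x, y) between two
    frames by solving the Lucas-Kanade least-squares system over a window."""
    half = win // 2
    Ix = np.gradient(prev_img, axis=1)        # spatial derivative in x
    Iy = np.gradient(prev_img, axis=0)        # spatial derivative in y
    It = curr_img - prev_img                  # temporal derivative
    roi = (slice(y - half, y + half + 1), slice(x - half, x + half + 1))
    A = np.stack([Ix[roi].ravel(), Iy[roi].ravel()], axis=1)
    b = -It[roi].ravel()
    d, *_ = np.linalg.lstsq(A, b, rcond=None)  # solve A @ d ~= b
    return d                                   # [dx, dy]
```

A multi-pyramid tracker runs this step coarse-to-fine, so that larger motions stay within the linearization range of the image gradients at each level.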

2D tracking is the most time-consuming part of the Elbrus tracking pipeline. In the case of stereo images, the 2D component consumes over 90% of the time in the CPU reference implementation. To keep tracking real-time despite this cost, the most time-consuming 2D routines (building the image pyramid, smoothing, spatial derivatives, and other 2D convolution image-processing operations) are applied selectively: features are tracked across all frames from the left camera, while matching between the left and right camera images is performed only on the keyframes of the video sequence.
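Since pyramid construction is named above as one of the costly 2D routines, the following NumPy sketch shows what building a smoothed image pyramid involves. A 3x3 box blur stands in for whatever smoothing kernel the library actually uses; names are illustrative:

```python
import numpy as np

def smooth(img):
    """3x3 box blur via summed shifts (stand-in for a real Gaussian kernel)."""
    padded = np.pad(img, 1, mode="edge")
    acc = np.zeros(img.shape, dtype=float)
    for dy in (0, 1, 2):
        for dx in (0, 1, 2):
            acc += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return acc / 9.0

def build_pyramid(img, levels=3):
    """Repeatedly smooth and 2x-decimate to produce coarse-to-fine images."""
    pyramid = [img.astype(float)]
    for _ in range(levels - 1):
        pyramid.append(smooth(pyramid[-1])[::2, ::2])
    return pyramid
```

Every level costs a full-image convolution, which is why a tracker tries to build pyramids only where they are needed.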

The 3D component performs 3D tracking: it finds the left camera pose (6 DOF) using a combination of triangulation, resectioning (finding the camera pose from known 3D locations, called landmarks, and their 2D projections, called tracks, in the camera image plane), and Sparse Bundle Adjustment (SBA) for the stereo case. Camera poses and all 3D algorithmic work in Elbrus use exponential maps (twists) as the 6-DOF camera pose representation.
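As a sketch of the twist representation mentioned above, the following converts a 6-vector (v, ω) into a 4x4 rigid-body transform using the standard closed-form SE(3) exponential map. This is the textbook formula, not Elbrus's actual code:

```python
import numpy as np

def twist_to_pose(xi):
    """SE(3) exponential map: 6-vector twist (v, w) -> 4x4 pose matrix."""
    v, w = xi[:3], xi[3:]
    theta = np.linalg.norm(w)                 # rotation angle
    K = np.array([[0, -w[2], w[1]],
                  [w[2], 0, -w[0]],
                  [-w[1], w[0], 0]])          # skew-symmetric matrix of w
    if theta < 1e-9:
        R, V = np.eye(3), np.eye(3)           # no rotation: pure translation
    else:
        K = K / theta                          # unit-axis skew matrix
        R = (np.eye(3) + np.sin(theta) * K
             + (1 - np.cos(theta)) * (K @ K))  # Rodrigues' formula
        V = (np.eye(3) + (1 - np.cos(theta)) / theta * K
             + (theta - np.sin(theta)) / theta * (K @ K))
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = V @ v
    return T
```

The advantage of this representation is that a 6-DOF pose update is an unconstrained 6-vector, which suits iterative solvers such as SBA.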

Source Code

The Isaac SDK includes the Elbrus tracker code in the form of a dynamic library, wrapped by a codelet. The Isaac codelet wrapping the Elbrus stereo tracker takes a pair of input images and the camera intrinsics. The camera pose is represented by a quaternion and a translation vector, relative to the location of the camera.
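A quaternion-plus-translation pose transforms a 3D point as p' = R(q)·p + t. The following NumPy sketch shows that arithmetic; the actual codelet message types and field names are not shown here, and the helper names are invented for the example:

```python
import numpy as np

def quat_to_matrix(q):
    """Rotation matrix from a unit quaternion q = (w, x, y, z)."""
    w, x, y, z = q
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z),     2 * (x * z + w * y)],
        [2 * (x * y + w * z),     1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y),     2 * (y * z + w * x),     1 - 2 * (x * x + y * y)],
    ])

def apply_pose(q, t, p):
    """Transform point p by the pose (rotation q, translation t)."""
    return quat_to_matrix(q) @ np.asarray(p, float) + np.asarray(t, float)
```

The quaternion keeps the rotation normalized and interpolation-friendly, while the 4x4 matrix form above is convenient for composing poses.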

Running the Sample Application

The Elbrus sample application uses a ZED stereo camera. First, connect the ZED camera to the host system or the Jetson platform you are using.


Use the following procedures to run the sample application.

To Run the Sample Application on the Host System

  1. Build the sample application with the following command:

    bob@desktop:~/isaac$ bazel build apps/samples/stereo_vo
  2. Run the sample application with the following command:

    bob@desktop:~/isaac$ bazel run apps/samples/stereo_vo

To Run the Application on Jetson

  1. Run the following command on the host computer, where <JETSON_IP> is the IP address of your Jetson system:

    bob@desktop:~/isaac$ ./engine/build/ -p //apps/samples/stereo_vo:stereo_vo-pkg -d jetpack42 -h <JETSON_IP>
  2. Log on to the Jetson system and run the application with the following commands:

    bob@jetson:~/$ cd deploy/bob/stereo_vo-pkg
    bob@jetson:~/deploy/bob/stereo_vo-pkg$ ./apps/samples/stereo_vo/stereo_vo

    Where “bob” is your user name on the host system.

To View Output from the Application in Websight

While the application is running, open Isaac Sight in a browser by navigating to http://localhost:3000. If you are running the application on a Jetson platform, make sure to use the IP address of the Jetson system instead of localhost.