Image Warping

Real-world captured imagery has certain imperfections, including:

  • Geometric distortion (and other optical aberrations)
  • Noise
  • Vignetting
  • Color imbalances

This section describes available resources for correcting geometric distortion.

Most computer vision algorithms are insensitive to vignetting and color imbalance, and they are designed to be robust to noise. However, they tend to rely on ideal perspective images, otherwise known as pinhole images, that exhibit no distortion.

The following is an example of a geometrically distorted, or warped, image, followed by an example of correction.



[Figure: a geometrically distorted image and its corrected version. IACHR building. Credit: OAS]



Geometric distortion is commonly categorized into two forms: radial and tangential.

Radial distortion is inherently radially symmetric. It is common in simple lenses, whose designs accept some distortion in order to minimize other optical aberrations such as astigmatism, chromatic aberration, coma, field curvature, and spherical aberration. Compound lenses have much less distortion because they can minimize all of these aberrations simultaneously, but because they are large and expensive, they are rarely found in embedded applications like robotics. Radial distortion can be modeled with a univariate polynomial and is easy to correct in software after acquisition.

Tangential distortion is not radially symmetric, although it is symmetric with respect to a line radiating from the center of projection. Tangential distortion arises from misaligned elements in a compound lens, so it is negligible in the simple lenses found in embedded applications. It is modeled with a bivariate polynomial and is also easy to correct in software.

There are two fundamental types of projections acquired with typical cameras and lenses: perspective/rectilinear/planar and fisheye/equidistant/spherical. Perspective lenses have a practical maximum field of view (FOV) of 120°, whereas fisheye lenses can acquire more than 180°, with some exceeding 220°.

Radial distortion can be used to represent a fisheye lens with a perspective lens and vice versa, up to a certain FOV. The ideal lenses are related by the tangent (or arctangent) function, which is well approximated by a polynomial close to the center of projection, but the approximation diverges rapidly farther away. It is best to use the projection model that most closely matches the lens. As a rule of thumb, perspective should be used as the base projection for FOV < 90°, and fisheye for FOV > 120°.
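The effect of approximating the tangent with a polynomial can be illustrated numerically. The sketch below is only an illustration: it uses the degree-7 Taylor polynomial of the tangent as a stand-in for a fitted distortion polynomial (an assumption, since a calibrated polynomial would have different coefficients), and the function names are hypothetical. Note that an angle of 45° off-axis corresponds to a 90° FOV.

```python
import math

def tan_taylor(theta):
    """Degree-7 Taylor polynomial of tan(theta) about theta = 0, in Horner form."""
    t2 = theta * theta
    return theta * (1.0 + t2 * (1.0 / 3.0 + t2 * (2.0 / 15.0 + t2 * (17.0 / 315.0))))

def relative_error(off_axis_deg):
    """Relative error of the polynomial against tan at a given off-axis angle."""
    theta = math.radians(off_axis_deg)
    exact = math.tan(theta)
    return abs(tan_taylor(theta) - exact) / exact

# Print the error at increasing angles from the optical axis.
for angle in (20, 45, 60, 80):
    print(f"{angle:2d} deg off-axis: relative error {relative_error(angle):.2e}")
```

With this particular polynomial the error is negligible near the center and grows to roughly a third of the true value at 80° off-axis, which is consistent with the advice to pick the base projection closest to the lens.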

The Isaac SDK accommodates both radial and tangential distortion correction for perspective lenses, but only radial distortion correction for fisheye lenses.

Geometric distortion is implemented as the sum of the radial and tangential distortion corrections. The Isaac SDK uses the Brown model, which is also used by OpenCV. The two order the radial and tangential coefficients differently, though. The Isaac SDK uses \(\left\{ k_{0},k_{1},k_{2},k_{3},k_{4} \right\}\), with the radial coefficients \(k_{0},k_{1},k_{2}\) first and the tangential coefficients \(k_{3},k_{4}\) last, whereas OpenCV, expressed in the same symbols, uses \(\left\{ k_{0},k_{1},k_{3},k_{4},k_{2} \right\}\). OpenCV can be used to calibrate the intrinsic parameters of a camera, including these coefficients.
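The reordering between the two conventions can be sketched as follows. The helper names are hypothetical and are not part of either library. The sketch assumes a five-element coefficient list and relies on OpenCV's documented ordering of \((k_1, k_2, p_1, p_2, k_3)\), in which the two tangential terms sit between the second and third radial terms.

```python
def isaac_to_opencv(coeffs):
    """Reorder [k0, k1, k2, k3, k4] (radial x3, tangential x2) into OpenCV order."""
    k0, k1, k2, k3, k4 = coeffs
    # OpenCV order: radial, radial, tangential, tangential, radial.
    return [k0, k1, k3, k4, k2]

def opencv_to_isaac(dist_coeffs):
    """Reorder OpenCV's (k1, k2, p1, p2, k3) into radial-first order."""
    k1, k2, p1, p2, k3 = dist_coeffs
    return [k1, k2, k3, p1, p2]

# Usage: coefficients from an OpenCV calibration, reordered radial-first.
calibrated = [-0.28, 0.07, 0.001, -0.0005, 0.0]  # made-up illustrative values
print(opencv_to_isaac(calibrated))
```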

Radial Distortion Correction

Radial distortion is corrected with the following equation:

\[\begin{split}{x_{o} = x_{i} + x_{i}\left( k_{0}r_{i}^{2} + k_{1}r_{i}^{4} + k_{2}r_{i}^{6} \right) } \\ {y_{o} = y_{i} + y_{i}\left( k_{0}r_{i}^{2} + k_{1}r_{i}^{4} + k_{2}r_{i}^{6} \right) }\end{split}\]


\[r_{i} = \sqrt{x_{i}^{2} + y_{i}^{2}}\]
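The radial equation above can be evaluated directly. The following is a minimal sketch, not the SDK implementation. It assumes that \(x_i\) and \(y_i\) are normalized coordinates measured from the principal point (the text does not state the units), and the function name is hypothetical.

```python
def radial_correction(x_i, y_i, k0, k1, k2):
    """Apply x_o = x_i + x_i * (k0*r^2 + k1*r^4 + k2*r^6), and likewise for y."""
    r2 = x_i * x_i + y_i * y_i          # r_i squared; the square root is never needed
    scale = k0 * r2 + k1 * r2 ** 2 + k2 * r2 ** 3
    return x_i + x_i * scale, y_i + y_i * scale

# Usage: a point halfway along the x axis with a single nonzero coefficient.
print(radial_correction(0.5, 0.0, 0.1, 0.0, 0.0))
```

Because only even powers of \(r_i\) appear, the polynomial can be evaluated in terms of \(r_i^2\) without taking a square root.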

Tangential Distortion Correction

Tangential distortion is corrected with the equation:

\[\begin{split}{x_{o} = x_{i} + 2k_{3}x_{i}y_{i} + k_{4}\left( r_{i}^{2} + 2x_{i}^{2} \right) } \\ {y_{o} = y_{i} + k_{3}\left( r_{i}^{2} + 2y_{i}^{2} \right) + 2k_{4}x_{i}y_{i}}\end{split}\]
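As a sketch of how the tangential equation could be evaluated, and how it might be combined with the radial terms as a sum of corrections, consider the following. The function names are hypothetical, the coordinates are assumed to be normalized and measured from the principal point, and the coefficient list uses the radial-first order \(\{k_0, \ldots, k_4\}\).

```python
def tangential_correction(x_i, y_i, k3, k4):
    """Apply the tangential terms of the Brown model to one point."""
    r2 = x_i * x_i + y_i * y_i
    x_o = x_i + 2.0 * k3 * x_i * y_i + k4 * (r2 + 2.0 * x_i * x_i)
    y_o = y_i + k3 * (r2 + 2.0 * y_i * y_i) + 2.0 * k4 * x_i * y_i
    return x_o, y_o

def brown_correction(x_i, y_i, k):
    """Add the radial and tangential offsets to the input point."""
    k0, k1, k2, k3, k4 = k
    r2 = x_i * x_i + y_i * y_i
    radial = k0 * r2 + k1 * r2 ** 2 + k2 * r2 ** 3
    dx = x_i * radial + 2.0 * k3 * x_i * y_i + k4 * (r2 + 2.0 * x_i * x_i)
    dy = y_i * radial + k3 * (r2 + 2.0 * y_i * y_i) + 2.0 * k4 * x_i * y_i
    return x_i + dx, y_i + dy

# Usage: one radial and one tangential coefficient set to a nonzero value.
print(brown_correction(0.5, 0.0, [0.1, 0.0, 0.0, 0.0, 0.1]))
```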

In fisheye distortion correction, we only accommodate radial distortion, but use a higher order polynomial:

\[\begin{split}{x_{o} = x_{i} + x_{i}\left( k_{0}r_{i}^{2} + k_{1}r_{i}^{4} + k_{2}r_{i}^{6} + k_{3}r_{i}^{8} \right) } \\ {y_{o} = y_{i} + y_{i}\left( k_{0}r_{i}^{2} + k_{1}r_{i}^{4} + k_{2}r_{i}^{6} + k_{3}r_{i}^{8} \right) }\end{split}\]

This is also the model used by OpenCV.
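The higher-order fisheye polynomial can be evaluated with Horner's scheme in \(r_i^2\). This is a minimal sketch with a hypothetical function name. It assumes normalized coordinates measured from the principal point and a list of four radial coefficients.

```python
def fisheye_radial_correction(x_i, y_i, k):
    """Apply x_o = x_i + x_i * (k0*r^2 + k1*r^4 + k2*r^6 + k3*r^8), likewise for y."""
    r2 = x_i * x_i + y_i * y_i
    # Horner evaluation: ((k3*r2 + k2)*r2 + k1)*r2 + k0, all multiplied by r2.
    scale = 0.0
    for coeff in reversed(k):
        scale = (scale + coeff) * r2
    return x_i + x_i * scale, y_i + y_i * scale

# Usage: a point at unit radius, where the scale is simply the sum of the coefficients.
print(fisheye_radial_correction(1.0, 0.0, [0.1, 0.01, 0.001, 0.0001]))
```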

The other intrinsic parameters of a camera, besides distortion, include focal length and principal point.

Focal Length

The focal length of a digital camera is measured in pixels, though it can be considered to be pixels/radian. The projection equation for an ideal perspective lens is:

\[r = f\ \mathrm{\tan}\ \vartheta\]

and for an ideal fisheye lens is:

\[r = f\ \vartheta\]

where \(r\) is the distance from the center of projection and \(\vartheta\) is the inclination angle from the optical axis. Near the optical axis, \(\tan\vartheta \approx \vartheta\), so in both models \(dr/d\vartheta = f\) at the center of projection. The focal length therefore equals the angular pixel density at the center of projection, hence the interpretation as pixels/radian.
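The two projection equations can be used to estimate a focal length in pixels from a lens's field of view. The sketch below is illustrative only: the function names are hypothetical, and it assumes an ideal lens, a principal point at the image center, and an FOV that spans exactly the image width.

```python
import math

def perspective_radius(f, theta):
    """Ideal perspective projection: r = f * tan(theta)."""
    return f * math.tan(theta)

def fisheye_radius(f, theta):
    """Ideal fisheye (equidistant) projection: r = f * theta."""
    return f * theta

def focal_from_fov(width_px, fov_deg, fisheye=False):
    """Solve the projection equation for f, with r = width/2 at theta = FOV/2."""
    half_width = width_px / 2.0
    half_fov = math.radians(fov_deg) / 2.0
    if fisheye:
        return half_width / half_fov
    return half_width / math.tan(half_fov)  # only meaningful for FOV < 180 degrees

# Usage: a 1280-pixel-wide image with a 90 degree perspective lens
# and with a 180 degree fisheye lens.
print(focal_from_fov(1280, 90))
print(focal_from_fov(1280, 180, fisheye=True))
```

Close to the optical axis the two projections place a point at nearly the same radius for the same focal length, which is the pixels/radian interpretation described above.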

Principal Point (Center of Projection)

The point where the optical axis of the lens intersects the imaging plane is known as the principal point or center of projection. Ideally, this would be located at the center of the acquired image, but this has been seen to differ by as much as 100 pixels in commercial cameras.

The warping facilities are general and expose the parameters of the output image: focal length, principal point, orientation, and image size.

Output Principal Point

Distortion correction changes the shape of the input rectangle so that it bows either convex or concave. This bowing takes place with respect to the principal point, so it is a good idea to keep the output and input principal points in the same location relative to their centers, to yield symmetric clipping or exposed transparencies.

Output Focal Length

To maintain the same resolution on the output as the input, their focal lengths should be approximately the same. The Warp API has separate horizontal and vertical focal lengths, and the different focal lengths can be maintained in the output as well. Another option is to use the geometric mean of the two input focal lengths for both output focal lengths. A third option is to reduce the output focal length to 70% of the input focal length, to undo the interpolation that occurred in acquiring a color image using a Bayer sensor. This yields the same detail with fewer pixels, which improves throughput.
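The three options can be summarized in a small helper. This is a sketch only: the function, its `mode` names, and the `bayer_scale` parameter are hypothetical and are not part of the Warp API. It also assumes the 70% factor is applied to both the horizontal and the vertical focal length.

```python
import math

def output_focal_lengths(fx, fy, mode="same", bayer_scale=0.7):
    """Pick output focal lengths (in pixels) from the input focal lengths."""
    if mode == "same":
        # Keep the separate horizontal and vertical focal lengths.
        return fx, fy
    if mode == "geometric_mean":
        # Use one focal length for both axes.
        f = math.sqrt(fx * fy)
        return f, f
    if mode == "bayer":
        # Reduce resolution to undo the Bayer interpolation.
        return fx * bayer_scale, fy * bayer_scale
    raise ValueError(f"unknown mode: {mode}")

# Usage with made-up input focal lengths.
print(output_focal_lengths(400.0, 900.0, mode="geometric_mean"))
print(output_focal_lengths(1000.0, 1000.0, mode="bayer"))
```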

Output Image Size

Correcting for distortion and changing the focal length will affect the size of the projected image. A reasonable choice is to keep the output image size, focal lengths, and principal point the same as those of the input.

Output Orientation

In the case of monocular vision, it suffices to orient the output projection in the same direction as the input. However, for stereoscopic vision, if both cameras are pointing in the same direction and have the same focal length, one can use algorithms that are significantly faster and more robust. Either a rotation matrix or a set of Euler angles can specify the rotational correction.

The Isaac undistortion codelet takes a Color Camera proto and generates another Color Camera proto. Each of these protos includes both the camera intrinsic parameters and an image. The only difference between the input and output intrinsic parameters is that the output distortion is 0. Also, if the source image was a fisheye image, it is converted into a perspective image.

© Copyright 2018-2020, NVIDIA Corporation. Last updated on Feb 1, 2023.