
VPI - Vision Programming Interface

2.4 Release

Pinhole Camera Model

The pinhole camera model describes a camera that projects 3D scene points onto the image plane by means of a perspective transformation. It is described by:

\begin{align*} s \mathsf{p} &= \mathsf{K} [ \mathsf{R} | \mathsf{t} ] \mathsf{P} \end{align*}

or

\begin{align*} s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} &= \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} r_{11} & r_{12} & r_{13} & t_1 \\ r_{21} & r_{22} & r_{23} & t_2 \\ r_{31} & r_{32} & r_{33} & t_3 \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix} \\ (x_d,y_d) &= L(\tilde{x},\tilde{y}) \end{align*}

where:

  • (X,Y,Z) are the coordinates of a 3D point in world space.
  • (u,v) are the coordinates (in pixels) of the projection of (X,Y,Z) on the image plane.
  • \mathsf{K} is a 3x3 matrix of intrinsic camera parameters.
  • [\mathsf{R}|\mathsf{t}] is a 3x4 matrix of extrinsic camera parameters, mapping world space to camera space. It is composed of a 3D rotation followed by a translation.
  • (c_x,c_y) is the camera's principal point in pixels, i.e. the point where the optical axis meets the image plane. It is usually at or near the image center.
  • f_x,f_y are the camera's horizontal and vertical focal lengths, respectively, expressed in pixel units.
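  • s is the projective scale factor. Because the last row of \mathsf{K} is (0,0,1), s equals the depth of the point in camera space.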
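
The projection equation above can be sketched in a few lines of plain Python. This is only an illustration of the math, not VPI code: the function name `project_point` and the sample intrinsics (800-pixel focal lengths, principal point at (320, 240)) are assumptions chosen for the example, and lens distortion is ignored.

```python
def project_point(K, R, t, P):
    """Project a world-space point P = (X, Y, Z) to pixel coordinates.

    Implements s * [u, v, 1]^T = K [R|t] [X, Y, Z, 1]^T and returns (u, v, s).
    K and R are 3x3 nested lists, t is a length-3 list.
    """
    # World space -> camera space: Pc = R * P + t
    Pc = [sum(R[i][j] * P[j] for j in range(3)) + t[i] for i in range(3)]
    # Camera space -> homogeneous image coordinates: h = K * Pc = s * [u, v, 1]^T
    h = [sum(K[i][j] * Pc[j] for j in range(3)) for i in range(3)]
    s = h[2]
    if s == 0:
        raise ValueError("point has zero depth in camera space; projection undefined")
    # Divide by the scale factor to obtain pixel coordinates
    return h[0] / s, h[1] / s, s


# Hypothetical intrinsics, for illustration only
fx, fy, cx, cy = 800.0, 800.0, 320.0, 240.0
K = [[fx, 0.0, cx],
     [0.0, fy, cy],
     [0.0, 0.0, 1.0]]

# Identity extrinsics: world space coincides with camera space
R = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
t = [0.0, 0.0, 0.0]

u, v, s = project_point(K, R, t, (0.5, -0.25, 2.0))
print(u, v, s)  # 520.0 140.0 2.0
```

With identity extrinsics the result reduces to u = f_x X/Z + c_x and v = f_y Y/Z + c_y, so a point on the optical axis, such as (0, 0, 5), lands on the principal point (c_x, c_y).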