VPI - Vision Programming Interface

0.4.4 Release

Common Types

Defines common types used by several components.

Data Structures

struct  VPIKeypoint
 Stores a keypoint coordinate.
 
struct  VPIHomographyTransform2D
 Stores a generic 2D homography transform.
 
struct  VPIBoundingBox
 Stores a generic 2D bounding box.
 

Enumerations

enum  VPIBoundaryCond
 Image boundary conditions specify how pixel values outside of the image domain should be constructed.
 
enum  VPIInterpolationType
 Interpolation types supported by several algorithms.
 
enum  VPILockMode
 Defines the lock modes used by memory lock functions.
 



Data Structure Documentation

◆ VPIKeypoint

struct VPIKeypoint

Stores a keypoint coordinate.

The coordinate is relative to the top-left corner of an image.

Definition at line 353 of file Types.h.

Data Fields
float x Keypoint's x coordinate.
float y Keypoint's y coordinate.

◆ VPIHomographyTransform2D

struct VPIHomographyTransform2D

Stores a generic 2D homography transform.

When only scaling and translation are needed, the parameters must be arranged in the matrix as follows:

\[ \begin{bmatrix} s_x & 0 & p_x \\ 0 & s_y & p_y \\ 0 & 0 & 1 \end{bmatrix} \]

Scaling \((s_x,s_y)\) is relative to the center of the patch; position \((p_x,p_y)\) is relative to the top-left corner of the image.

In the general case, given a homogeneous 2D point \(P(x,y,1)\) and the matrix \(M^{3 \times 3}\), the Euclidean 2D point \(O\) is defined as

\begin{align} T &= M \cdot P \\ O &= (T_x/T_z, T_y/T_z) \end{align}
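
The projection above can be sketched in C as follows. This is only an illustration: `Keypoint`, `Homography2D` and `apply_homography` are hypothetical local stand-ins that mirror the documented fields, not part of the VPI API, and the handling of \(T_z = 0\) is an assumption.

```c
/* Local stand-ins mirroring the documented fields (assumption: real code
   would use VPIKeypoint and VPIHomographyTransform2D from <vpi/Types.h>). */
typedef struct { float x, y; } Keypoint;
typedef struct { float mat3[3][3]; } Homography2D;

/* Computes T = M * P for P = (x, y, 1), then O = (T_x/T_z, T_y/T_z).
   Returns 0 on success, -1 when T_z is zero (the point maps to infinity). */
static int apply_homography(const Homography2D *h, Keypoint in, Keypoint *out)
{
    const float (*m)[3] = h->mat3;
    float tx = m[0][0] * in.x + m[0][1] * in.y + m[0][2];
    float ty = m[1][0] * in.x + m[1][1] * in.y + m[1][2];
    float tz = m[2][0] * in.x + m[2][1] * in.y + m[2][2];

    if (tz == 0.0f)
        return -1;

    out->x = tx / tz;
    out->y = ty / tz;
    return 0;
}
```

With a scale-and-translation matrix the last row is (0, 0, 1), so \(T_z = 1\) and the division leaves the coordinates unchanged.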

Definition at line 383 of file Types.h.

Data Fields
float mat3[3][3] 3x3 homogeneous matrix that defines the homography.

◆ VPIBoundingBox

struct VPIBoundingBox

Stores a generic 2D bounding box.

Although this structure can store a 2D bounding box transformed by any homography, most of the time it stores an axis-aligned bounding box. To retrieve it, do the following:

float x = xform.mat3[0][2];
float y = xform.mat3[1][2];
float w = width * xform.mat3[0][0];
float h = height * xform.mat3[1][1];
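
The same retrieval can be wrapped in a small helper. The sketch below assumes the transform contains only scaling and translation; `Homography2D`, `BoundingBox`, `Rect` and `axis_aligned_rect` are hypothetical stand-ins that mirror the documented fields and are not part of VPI.

```c
/* Local stand-ins mirroring the documented fields (assumption: real code
   would use VPIBoundingBox from <vpi/Types.h>). */
typedef struct { float mat3[3][3]; } Homography2D;
typedef struct { Homography2D xform; float width, height; } BoundingBox;
typedef struct { float x, y, w, h; } Rect;

/* Extracts the axis-aligned rectangle. Only meaningful when xform holds
   scaling and translation alone (no rotation, shear or perspective). */
static Rect axis_aligned_rect(const BoundingBox *bb)
{
    Rect r;
    r.x = bb->xform.mat3[0][2];              /* translation p_x */
    r.y = bb->xform.mat3[1][2];              /* translation p_y */
    r.w = bb->width  * bb->xform.mat3[0][0]; /* width scaled by s_x */
    r.h = bb->height * bb->xform.mat3[1][1]; /* height scaled by s_y */
    return r;
}
```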

Definition at line 403 of file Types.h.

Data Fields
float height Bounding box height.
float width Bounding box width.
VPIHomographyTransform2D xform Defines the bounding box top left corner and its homography.

Enumeration Type Documentation

◆ VPIBoundaryCond

#include <vpi/Types.h>

Image boundary conditions specify how pixel values outside of the image domain should be constructed.

Enumerator
VPI_BOUNDARY_COND_ZERO 

All pixels outside the image are considered to be zero.

VPI_BOUNDARY_COND_CLAMP 

Border pixels are repeated indefinitely.

Definition at line 216 of file Types.h.
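
To make the two conditions concrete, here is a minimal sketch of how an out-of-range read could be resolved on a row-major float image. `BoundaryCond`, its two values and `sample_pixel` are hypothetical names used for illustration; they stand in for the VPI enumerators and do not describe how any VPI backend is implemented.

```c
/* Stand-in for the two documented boundary conditions (hypothetical names). */
typedef enum { BOUNDARY_ZERO, BOUNDARY_CLAMP } BoundaryCond;

/* Reads pixel (x, y) from a row-major image of size width x height.
   Coordinates outside the image are resolved according to bc. */
static float sample_pixel(const float *img, int width, int height,
                          int x, int y, BoundaryCond bc)
{
    if (x < 0 || x >= width || y < 0 || y >= height) {
        if (bc == BOUNDARY_ZERO)
            return 0.0f;                 /* outside pixels read as zero */

        /* Clamp: snap to the nearest border pixel, which repeats it. */
        if (x < 0) x = 0; else if (x >= width)  x = width - 1;
        if (y < 0) y = 0; else if (y >= height) y = height - 1;
    }
    return img[y * width + x];
}
```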

◆ VPIInterpolationType

#include <vpi/Types.h>

Interpolation types supported by several algorithms.

Enumerator
VPI_INTERP_NEAREST 

Nearest neighbor interpolation.

\[ P(x,y) = \mathit{in}[\lfloor x+0.5 \rfloor, \lfloor y+0.5 \rfloor] \]

VPI_INTERP_LINEAR_PRECISE 

Precise linear interpolation.

Interpolation weights are defined as:

\begin{align*} w_1(t)& \triangleq t-\lfloor t \rfloor \\ w_0(t)& \triangleq 1 - w_1(t) \\ \end{align*}

Bilinear-interpolated value is given by the formula below:

\[ P(x,y) = \sum_{p=0}^1 \sum_{q=0}^1 \mathit{in}[\lfloor x \rfloor+p, \lfloor y \rfloor+q]w_p(x)w_q(y) \]
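
As an illustration, nearest-neighbor and bilinear sampling can be sketched in C as below. The sketch weights the sample at \(\lfloor x \rfloor\) by one minus the fractional part of \(x\) and the sample at \(\lfloor x \rfloor + 1\) by the fractional part, so integer coordinates return the pixel itself. All function names are hypothetical, and repeating border pixels for out-of-range reads is an assumption made to keep the example short.

```c
/* Floor of v as an int, written without <math.h> so the sketch needs no libm. */
static int floor_to_int(float v)
{
    int i = (int)v;                     /* truncates toward zero */
    return (v < (float)i) ? i - 1 : i;  /* fix up negative non-integers */
}

/* Reads pixel (x, y) from a row-major image, repeating border pixels. */
static float fetch_clamped(const float *img, int w, int h, int x, int y)
{
    if (x < 0) x = 0; else if (x >= w) x = w - 1;
    if (y < 0) y = 0; else if (y >= h) y = h - 1;
    return img[y * w + x];
}

/* Nearest neighbor: P(x,y) = in[floor(x + 0.5), floor(y + 0.5)]. */
static float interp_nearest(const float *img, int w, int h, float x, float y)
{
    return fetch_clamped(img, w, h, floor_to_int(x + 0.5f), floor_to_int(y + 0.5f));
}

/* Bilinear: blends the 2x2 neighborhood whose top-left pixel is (floor x, floor y). */
static float interp_linear(const float *img, int w, int h, float x, float y)
{
    int x0 = floor_to_int(x), y0 = floor_to_int(y);
    float ax = x - (float)x0;           /* fractional part of x */
    float ay = y - (float)y0;           /* fractional part of y */

    float top = (1.0f - ax) * fetch_clamped(img, w, h, x0,     y0)
              +         ax  * fetch_clamped(img, w, h, x0 + 1, y0);
    float bot = (1.0f - ax) * fetch_clamped(img, w, h, x0,     y0 + 1)
              +         ax  * fetch_clamped(img, w, h, x0 + 1, y0 + 1);

    return (1.0f - ay) * top + ay * bot;
}
```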

VPI_INTERP_LINEAR_FAST 

Fast linear interpolation.

Takes advantage of the CUDA backend's hardware interpolator and takes only 1 linear sample per pixel instead of 4, yielding better performance. There's a slight decrease in precision because the linear weights \(\alpha_x\) and \(\alpha_y\) are stored in 9-bit fixed-point format (8 bits of fractional value), which amounts to ~1% maximum relative error.

It uses a technique presented here.

For other backends, it works exactly like VPI_INTERP_LINEAR_PRECISE.

VPI_INTERP_LINEAR 

Alias to fast linear interpolation.

Use it when it doesn't matter whether the fast or the precise interpolator is selected.

VPI_INTERP_CATMULL_ROM_PRECISE 

Catmull-Rom cubic interpolation.

Catmull-Rom interpolation weights with \(A=-0.5\) are defined as follows:

\begin{align*} w_0(t) &\triangleq A(t+1)^3 - 5A(t+1)^2 + 8A(t+1) - 4A \\ w_1(t) &\triangleq (A+2)t^3 - (A+3)t^2 + 1 \\ w_2(t) &\triangleq (A+2)(1-t)^3 - (A+3)(1-t)^2 + 1 \\ w_3(t) &\triangleq 1 - w_0(t) - w_1(t) - w_2(t) \end{align*}

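Bicubic-interpolated value is given by the formula below, with the weights evaluated at the fractional parts \(t_x = x - \lfloor x \rfloor\) and \(t_y = y - \lfloor y \rfloor\):

\[ P(x,y) = \sum_{p=-1}^2 \sum_{q=-1}^2 \mathit{in}[\lfloor x \rfloor+p, \lfloor y \rfloor+q]\,w_{p+1}(t_x)\,w_{q+1}(t_y) \]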
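
A minimal 1D sketch of these weights, assuming \(t\) is the fractional part of the coordinate and the four samples are ordered from \(\lfloor x \rfloor - 1\) to \(\lfloor x \rfloor + 2\). The function names are hypothetical; the 2D case applies the same weights along each axis.

```c
/* Fills w[0..3] with the Catmull-Rom weights for a fractional offset t in [0, 1).
   w[0] weights the sample at floor(x) - 1, w[3] the sample at floor(x) + 2. */
static void catmull_rom_weights(float t, float w[4])
{
    const float A = -0.5f;
    float u = t + 1.0f;
    float v = 1.0f - t;

    /* Same polynomials as in the definition, written in Horner form. */
    w[0] = ((A * u - 5.0f * A) * u + 8.0f * A) * u - 4.0f * A;
    w[1] = ((A + 2.0f) * t - (A + 3.0f)) * t * t + 1.0f;
    w[2] = ((A + 2.0f) * v - (A + 3.0f)) * v * v + 1.0f;
    w[3] = 1.0f - w[0] - w[1] - w[2];
}

/* Interpolates between s[1] and s[2] using all four neighboring samples. */
static float interp_catmull_rom_1d(const float s[4], float t)
{
    float w[4];
    catmull_rom_weights(t, w);
    return w[0] * s[0] + w[1] * s[1] + w[2] * s[2] + w[3] * s[3];
}
```

At \(t = 0\) the weights reduce to (0, 1, 0, 0), so the interpolator passes through the original samples.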

VPI_INTERP_CATMULL_ROM_FAST 

Fast Catmull-Rom cubic interpolation.

Takes advantage of the CUDA backend's hardware interpolator and takes 9 linear samples per pixel instead of 16, yielding better performance. There's a slight decrease in precision because the linear weights are stored in hardware in 9-bit fixed-point format (8 bits of fractional value).

It uses a variation of the technique presented here. The paper assumes that \(w_1/(w_0+w_1), w_3/(w_2+w_3) \in [0,1]\), which doesn't hold for the Catmull-Rom interpolator, but for \(w_2/(w_1+w_2)\) it does. So, in the 1D case, 3 texture fetches are performed: one linear fetch corresponding to \(w_1T_0 + w_2T_1\), and two more regular fetches for \(w_0T_{-1}\) and \(w_3T_2\). These 3 fetches in 1D correspond to 9 fetches in 2D.

For other backends, it works exactly like VPI_INTERP_CATMULL_ROM_PRECISE.
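
The 1D regrouping can be checked on the CPU with the sketch below. It emulates the hardware linear fetch with an ordinary lerp and ignores the fixed-point quantization of the weight, so it only illustrates the arithmetic, not the CUDA implementation. `lerp_fetch` and `catmull_rom_fast_1d` are hypothetical names, and the weights are assumed to be precomputed.

```c
/* Emulates one hardware linear fetch between neighbors t0 and t1 at offset a. */
static float lerp_fetch(float t0, float t1, float a)
{
    return t0 + a * (t1 - t0);
}

/* s[0..3] hold T_{-1}, T_0, T_1, T_2 and w[0..3] the matching weights.
   Assumes w[1] + w[2] is nonzero, so the offset a is well defined. */
static float catmull_rom_fast_1d(const float s[4], const float w[4])
{
    float w12 = w[1] + w[2];
    float a   = w[2] / w12;   /* offset of the linear fetch, in [0, 1] */

    /* w12 * lerp(T_0, T_1, a) expands to w_1*T_0 + w_2*T_1. */
    return w[0] * s[0]                        /* regular fetch of T_{-1} */
         + w12 * lerp_fetch(s[1], s[2], a)    /* one linear fetch */
         + w[3] * s[3];                       /* regular fetch of T_2 */
}
```

Applying the same grouping along both axes gives 3 x 3 = 9 fetches, of which only the center one interpolates in both directions.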

VPI_INTERP_CATMULL_ROM 

Alias to fast Catmull-Rom cubic interpolator.

Use it when it doesn't matter whether the fast or the precise interpolator is selected.

Definition at line 242 of file Types.h.

◆ VPILockMode

#include <vpi/Types.h>

Defines the lock modes used by memory lock functions.

Enumerator
VPI_LOCK_READ 

Lock memory only for reading.

Writing to memory that is locked for reading leads to undefined behavior.

VPI_LOCK_WRITE 

Lock memory only for writing.

Reading from memory that is locked for writing leads to undefined behavior. It is expected that the whole memory is written to. If some regions are not written, the memory might not be updated correctly during unlock. In this case, it's better to use VPI_LOCK_READ_WRITE.

It might be slightly more efficient to lock only for writing, especially when performing non-shared memory mapping.

VPI_LOCK_READ_WRITE 

Lock memory for reading and writing.

Definition at line 443 of file Types.h.
