Common Types

Defines common types used by several components.

## Data Structures

Type | Name | Description |
---|---|---|
struct | VPIKeypoint | Stores a keypoint coordinate. |
struct | VPIHomographyTransform2D | Stores a generic 2D homography transform. |
struct | VPIBoundingBox | Stores a generic 2D bounding box. |

## Enumerations

Type | Name | Description |
---|---|---|
enum | VPIBoundaryCond | Image boundary conditions specify how pixel values outside of the image domain should be constructed. |
enum | VPIInterpolationType | Interpolation types supported by several algorithms. |
enum | VPILockMode | Defines the lock modes used by memory lock functions. |


struct VPIKeypoint |

struct VPIHomographyTransform2D |

Stores a generic 2D homography transform.

When only scale and translation are needed, the parameters must be arranged in the matrix as follows:

\[ \begin{bmatrix} s_x & 0 & p_x \\ 0 & s_y & p_y \\ 0 & 0 & 1 \end{bmatrix} \]

Scaling \((s_x,s_y)\) is relative to the center of the patch; position \((p_x,p_y)\) is relative to the top-left of the image.

In the general case, given a homogeneous 2D point \(P(x,y,1)\) and the matrix \(M^{3 \times 3}\), the Euclidean 2D point \(O(x,y)\) is defined as

\begin{align} T &= M \cdot P \\ O &= (T_x/T_z, T_y/T_z) \end{align}
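The projection above can be sketched in a few lines of C. This is a minimal illustration, not VPI code: `Homography2D` is a hypothetical local mirror of the documented `mat3` field so the snippet builds without the VPI headers, and `apply_homography` is a helper name invented here. The `mat3[row][column]` indexing is assumed from the matrix layout shown above.

```c
/* Hypothetical local mirror of the documented struct (one mat3 field).
   Real code would include <vpi/Types.h> and use VPIHomographyTransform2D. */
typedef struct
{
    float mat3[3][3]; /* 3x3 homogeneous matrix, assumed mat3[row][column] */
} Homography2D;

/* Computes T = M * P for P = (x, y, 1), then O = (T_x / T_z, T_y / T_z).
   Returns 0 on success, or -1 when T_z is zero (the point maps to infinity). */
static int apply_homography(const Homography2D *h, float x, float y,
                            float *out_x, float *out_y)
{
    const float tx = h->mat3[0][0] * x + h->mat3[0][1] * y + h->mat3[0][2];
    const float ty = h->mat3[1][0] * x + h->mat3[1][1] * y + h->mat3[1][2];
    const float tz = h->mat3[2][0] * x + h->mat3[2][1] * y + h->mat3[2][2];

    if (tz == 0.0f)
    {
        return -1; /* no finite Euclidean point exists */
    }

    *out_x = tx / tz;
    *out_y = ty / tz;
    return 0;
}
```

With a scale of \((2,3)\) and a translation of \((10,20)\) arranged as in the scale-and-translation matrix, the point \((1,1)\) maps to \((12,23)\).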


Data Fields | ||
---|---|---|

float | mat3[3][3] | 3x3 homogeneous matrix that defines the homography. |

struct VPIBoundingBox |

Stores a generic 2D bounding box.

Although this structure can store a 2D bounding box transformed by any homography, most of the time it stores an axis-aligned bounding box. To retrieve it, do the following:
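The snippet that originally followed this sentence is missing from this page, so the code below is only a plausible reading of the field descriptions, not the library's own example. It assumes that `xform` holds a pure scale-and-translation matrix in the layout documented for VPIHomographyTransform2D, that the translation terms give the top-left corner, and that the scale terms multiply `width` and `height`. The struct and function names are hypothetical stand-ins for the VPI types.

```c
/* Hypothetical local mirrors of the documented structs, so the sketch
   builds without the VPI headers. Field names follow the documentation. */
typedef struct
{
    float mat3[3][3];
} Homography2D;

typedef struct
{
    Homography2D xform; /* top-left corner and homography */
    float width;
    float height;
} BoundingBox;

/* Plain axis-aligned rectangle (hypothetical helper type). */
typedef struct
{
    float x, y, width, height;
} AxisAlignedBox;

/* Assumes xform is [[s_x, 0, p_x], [0, s_y, p_y], [0, 0, 1]].
   The result is meaningless if the matrix has rotation, shear or
   perspective terms. */
static AxisAlignedBox axis_aligned_box(const BoundingBox *bbox)
{
    AxisAlignedBox out;
    out.x      = bbox->xform.mat3[0][2];                /* p_x */
    out.y      = bbox->xform.mat3[1][2];                /* p_y */
    out.width  = bbox->width * bbox->xform.mat3[0][0];  /* width scaled by s_x */
    out.height = bbox->height * bbox->xform.mat3[1][1]; /* height scaled by s_y */
    return out;
}
```

A caller that cannot guarantee a scale-and-translation matrix would need to transform all four corners instead and take their extent.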


Data Fields | ||
---|---|---|

float | height | Bounding box height. |

float | width | Bounding box width. |

VPIHomographyTransform2D | xform | Defines the bounding box top-left corner and its homography. |

enum VPIBoundaryCond |

`#include <vpi/Types.h>`

Image boundary conditions specify how pixel values outside of the image domain should be constructed.

Enumerator | |
---|---|

VPI_BOUNDARY_COND_ZERO | All pixels outside the image are considered to be zero. |

VPI_BOUNDARY_COND_CLAMP | Border pixels are repeated indefinitely. |
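To make the two conditions concrete, the sketch below reads one pixel from a single-channel float image. It is an illustration only: `BoundaryCond`, its enumerators and `read_pixel` are hypothetical stand-ins for the VPI names, and the row-major, tightly packed layout is an assumption of this example.

```c
/* Hypothetical stand-ins for VPI_BOUNDARY_COND_ZERO and
   VPI_BOUNDARY_COND_CLAMP, so the sketch builds without VPI headers. */
typedef enum
{
    BOUNDARY_ZERO,
    BOUNDARY_CLAMP
} BoundaryCond;

/* Restricts v to the closed range [lo, hi]. */
static int clamp_int(int v, int lo, int hi)
{
    if (v < lo)
    {
        return lo;
    }
    return (v > hi) ? hi : v;
}

/* Reads pixel (x, y) from a row-major image with no row padding
   (an assumption of this sketch). Coordinates outside the image are
   resolved according to the boundary condition. */
static float read_pixel(const float *img, int width, int height,
                        int x, int y, BoundaryCond bc)
{
    const int outside = (x < 0 || x >= width || y < 0 || y >= height);

    if (outside && bc == BOUNDARY_ZERO)
    {
        return 0.0f; /* everything outside the image is zero */
    }

    /* For in-range coordinates the clamp is a no-op; for out-of-range
       ones it selects the nearest border pixel. */
    x = clamp_int(x, 0, width - 1);
    y = clamp_int(y, 0, height - 1);
    return img[y * width + x];
}
```

For a 3x2 image holding the values 1 to 6, reading `(-4, 0)` yields 0 under the zero condition and 1 (the top-left border pixel) under the clamp condition.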

enum VPIInterpolationType |

`#include <vpi/Types.h>`

Interpolation types supported by several algorithms.

Enumerator | |
---|---|

VPI_INTERP_NEAREST | Nearest neighbor interpolation. \[ P(x,y) = \mathit{in}[\lfloor x+0.5 \rfloor, \lfloor y+0.5 \rfloor] \] |

VPI_INTERP_LINEAR_PRECISE | Precise linear interpolation. Interpolation weights are defined as: \begin{align*} w_0(t)& \triangleq 1 - (t-\lfloor t \rfloor) \\ w_1(t)& \triangleq 1 - w_0(t) \\ \end{align*} The bilinearly interpolated value is given by the formula below: \[ P(x,y) = \sum_{p=0}^1 \sum_{q=0}^1 \mathit{in}[\lfloor x \rfloor+p, \lfloor y \rfloor+q]w_p(x)w_q(y) \] |

VPI_INTERP_LINEAR_FAST | Fast linear interpolation. Takes advantage of the CUDA backend's hardware interpolator and takes only 1 linear sample per pixel instead of 4, yielding better performance. There is a slight decrease in precision because the interpolation fractions \(\alpha_x\) and \(\alpha_y\) are stored in 9-bit fixed-point format (8 bits of fractional value), which amounts to ~1% maximum relative error. It uses a technique presented here. For other backends, it works exactly like VPI_INTERP_LINEAR_PRECISE. |

VPI_INTERP_LINEAR | Alias to fast linear interpolation. Use it when the choice between the fast and the precise interpolator does not matter. |

VPI_INTERP_CATMULL_ROM_PRECISE | Catmull-Rom cubic interpolation. Catmull-Rom interpolation weights with \(A=-0.5\) are defined as follows, where \(t\) is the fractional part of the coordinate: \begin{align*} w_0(t) &\triangleq A(t+1)^3 - 5A(t+1)^2 + 8A(t+1) - 4A \\ w_1(t) &\triangleq (A+2)t^3 - (A+3)t^2 + 1 \\ w_2(t) &\triangleq (A+2)(1-t)^3 - (A+3)(1-t)^2 + 1 \\ w_3(t) &\triangleq 1 - w_0(t) - w_1(t) - w_2(t) \end{align*} The bicubically interpolated value is given by the formula below: \[ P(x,y) = \sum_{p=-1}^2 \sum_{q=-1}^2 \mathit{in}[\lfloor x \rfloor+p, \lfloor y \rfloor+q]\,w_{p+1}(x-\lfloor x \rfloor)\,w_{q+1}(y-\lfloor y \rfloor) \] |

VPI_INTERP_CATMULL_ROM_FAST | Fast Catmull-Rom cubic interpolation. Takes advantage of the CUDA backend's hardware interpolator and takes 9 linear samples per pixel instead of 16, yielding better performance. There is a slight decrease in precision because the linear weights are stored in hardware in 9-bit fixed-point format (8 bits of fractional value). It uses a variation of the technique presented here. The paper assumes that \(w_1/(w_0+w_1), w_3/(w_2+w_3) \in [0,1]\), which does not hold for the Catmull-Rom interpolator, but it does hold for \(w_2/(w_1+w_2)\). So, in the 1D case, 3 texture fetches are performed: one linear fetch corresponding to \(w_1T_0 + w_2T_1\), and two more regular fetches for \(w_0T_{-1}\) and \(w_3T_2\). These 3 fetches in 1D correspond to 9 fetches in 2D. For other backends, it works exactly like VPI_INTERP_CATMULL_ROM_PRECISE. |

VPI_INTERP_CATMULL_ROM | Alias to the fast Catmull-Rom cubic interpolator. Use it when the choice between the fast and the precise interpolator does not matter. |
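The interpolation formulas above are easier to follow in one dimension. The sketch below is a plain CPU illustration, not VPI or CUDA code: every function name is hypothetical, and samples outside the array are clamped to the border purely for convenience. The linear version applies weight \(1-t\) to the sample at \(\lfloor x \rfloor\) and weight \(t\) to its right neighbor, where \(t = x - \lfloor x \rfloor\). The last function imitates the 3-fetch scheme described for VPI_INTERP_CATMULL_ROM_FAST, using a software linear fetch in place of the hardware one, so it shows the arithmetic identity but not the fixed-point precision loss.

```c
#define CATMULL_ROM_A (-0.5f)

/* Floor for values that fit in an int, written without libm. */
static int floor_to_int(float v)
{
    const int i = (int)v; /* truncates toward zero */
    return (v < (float)i) ? i - 1 : i;
}

/* Reads in[i], clamping i to [0, n-1] (a choice made for this sketch). */
static float fetch(const float *in, int n, int i)
{
    if (i < 0)
    {
        i = 0;
    }
    if (i >= n)
    {
        i = n - 1;
    }
    return in[i];
}

/* Nearest neighbor: in[floor(x + 0.5)]. */
static float interp_nearest(const float *in, int n, float x)
{
    return fetch(in, n, floor_to_int(x + 0.5f));
}

/* Linear: (1 - t) * in[floor(x)] + t * in[floor(x) + 1]. */
static float interp_linear(const float *in, int n, float x)
{
    const int   i = floor_to_int(x);
    const float t = x - (float)i;
    return (1.0f - t) * fetch(in, n, i) + t * fetch(in, n, i + 1);
}

/* Catmull-Rom weights w[0..3] for fractional offset t in [0, 1). */
static void catmull_rom_weights(float t, float w[4])
{
    const float A = CATMULL_ROM_A;
    const float u = t + 1.0f;
    const float v = 1.0f - t;
    w[0] = A * u * u * u - 5.0f * A * u * u + 8.0f * A * u - 4.0f * A;
    w[1] = (A + 2.0f) * t * t * t - (A + 3.0f) * t * t + 1.0f;
    w[2] = (A + 2.0f) * v * v * v - (A + 3.0f) * v * v + 1.0f;
    w[3] = 1.0f - w[0] - w[1] - w[2];
}

/* Direct 4-tap evaluation: w[k] multiplies in[floor(x) + k - 1]. */
static float interp_catmull_rom(const float *in, int n, float x)
{
    const int i = floor_to_int(x);
    float     w[4];
    catmull_rom_weights(x - (float)i, w);
    return w[0] * fetch(in, n, i - 1) + w[1] * fetch(in, n, i) +
           w[2] * fetch(in, n, i + 1) + w[3] * fetch(in, n, i + 2);
}

/* 3-fetch variant: one linear fetch at offset w2 / (w1 + w2) yields
   (w1 * T0 + w2 * T1) / (w1 + w2); two point fetches cover T-1 and T2. */
static float interp_catmull_rom_3fetch(const float *in, int n, float x)
{
    const int i = floor_to_int(x);
    float     w[4];
    float     sum, lin;
    catmull_rom_weights(x - (float)i, w);
    sum = w[1] + w[2];                               /* stays >= 1 on [0, 1) */
    lin = interp_linear(in, n, (float)i + w[2] / sum);
    return w[0] * fetch(in, n, i - 1) + sum * lin + w[3] * fetch(in, n, i + 2);
}
```

In exact arithmetic both Catmull-Rom functions return the same value; in floating point they agree up to rounding. The 2D case applies the same weights along each axis, which is where the 16 taps (or 9 fetches) per pixel come from.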

enum VPILockMode |

`#include <vpi/Types.h>`

Defines the lock modes used by memory lock functions.
