Overview

The Recursive Gaussian Filter is a low-pass infinite impulse response (IIR) Gaussian filter that smooths an image by chaining low-order causal and anti-causal recursive filters in series. It closely approximates Gaussian filters with wide support, i.e. those with large standard deviations (sigma), while running faster than a direct-convolution Gaussian filter. Its computing time is the same no matter how large the sigma is, because it has no kernel and therefore does not depend on a kernel support size.

It supports a single mode of operation: the user provides the filter standard deviation (sigma).

The image below shows a usage example. The entire input image is blurred with a large sigma using the Recursive Gaussian Filter to serve as background, while a mask is used to segment out and preserve the heron in the middle of the image.

Input and mask	Gaussian sigma	Output
	\[ \sigma=17 \]

Implementation

The Recursive Gaussian Filter is implemented as a sequence of a forward-direction (causal) filter and a backward-direction (anti-causal) filter applied to the rows and columns of an input image. To better understand them, let us first define a linear time-invariant (LTI) filter as:

\[ \sum_{i=-r}^{r} a_i z[k-i] = \sum_{i=-s}^{s} b_i w[k-i] \]

The \(a_i\) and \(b_i\) are filter design parameters. The \(w[k]\) and \(z[k]\) are input and output signals, respectively. This LTI filter can be decomposed into a convolution pass and a causal and anti-causal combination of recursive filter passes:

\[ x[k] = \sum_{i=-s}^{s} c_i w[k-i] \]

\[ y[k] = x[k] - \sum_{i=1}^{r} d_i y[k-i] \]

\[ z[k] = y[k] - \sum_{i=1}^{r} e_i z[k+i] \]

The convolution has a finite impulse response (FIR) given by the \(c_i\) coefficients, and \(s\) is the convolution kernel support. The recursive filters, in turn, have an infinite impulse response (IIR) given by the \(d_i\) and \(e_i\) coefficients, and \(r\) is the filter order. The \(x[k]\) and \(y[k]\) are the intermediate output signals of the FIR pass and of the causal IIR pass, respectively. This process can be extended from 1D signals to 2D images by independently filtering all columns and then all resulting rows.
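As a minimal sketch of the three passes described above, the following pure-Python code applies the FIR pass, the causal recursive pass and the anti-causal recursive pass to a 1D signal, then extends it to 2D by filtering columns and then rows. The function names and the first-order coefficients (\(r=1\), \(s=0\)) are illustrative assumptions, chosen only so that the impulse response sums to one; they are not the third-order Gaussian coefficients used by VPI, which this page does not list. Samples outside the signal are treated as zero here, which is a further simplification.

```python
def recursive_smooth_1d(w, c, d, e):
    """FIR pass (c), then causal pass (d), then anti-causal pass (e)."""
    n, s, r = len(w), len(c) // 2, len(d)
    # FIR pass: x[k] = sum_{i=-s}^{s} c_i w[k-i], with zeros outside the signal.
    x = [sum(c[i + s] * w[k - i] for i in range(-s, s + 1) if 0 <= k - i < n)
         for k in range(n)]
    # Causal pass, left to right: y[k] = x[k] - sum_{i=1}^{r} d_i y[k-i].
    y = [0.0] * n
    for k in range(n):
        y[k] = x[k] - sum(d[i - 1] * y[k - i]
                          for i in range(1, r + 1) if k - i >= 0)
    # Anti-causal pass, right to left: it reads outputs at later indices k+i.
    z = [0.0] * n
    for k in range(n - 1, -1, -1):
        z[k] = y[k] - sum(e[i - 1] * z[k + i]
                          for i in range(1, r + 1) if k + i < n)
    return z


def recursive_smooth_2d(img, c, d, e):
    """Filter every column, then every row of the column-filtered result."""
    cols = [recursive_smooth_1d(list(col), c, d, e) for col in zip(*img)]
    return [recursive_smooth_1d(list(row), c, d, e) for row in zip(*cols)]


# Hypothetical first-order coefficients with feedback strength a.
a = 0.5
c, d, e = [(1 - a) ** 2], [-a], [-a]

impulse = [0.0] * 41
impulse[20] = 1.0
z = recursive_smooth_1d(impulse, c, d, e)
print(round(sum(z), 4))  # close to 1: the filter preserves the mean
```

Running the causal and anti-causal passes back to back is what makes the combined response symmetric around the impulse, even though each pass alone is one-sided.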

Instead of direct convolution, a third-order recursive filter can be used to approximate Gaussian filtering, operating in time that is linear in the number of pixels and independent of the standard deviation. Such filters are the best alternative in terms of performance, quality and simplicity, especially for wide-support filters. In the example above, a direct convolution kernel with a support of \(101 \times 101\) is needed to accommodate the specified \(\sigma=17\) and obtain the same blur effect.
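To put the \(\sigma=17\) example in numbers, the sketch below counts multiplications per output pixel for a support of 101 (a radius of 50 pixels, roughly \(2.9\sigma\)) versus a third-order recursive filter. The helper names are hypothetical, and the count is a rough estimate that assumes one causal and one anti-causal pass per image dimension; it ignores gain factors, border handling and memory traffic, so it is not a prediction of measured run time.

```python
def direct_mults_per_pixel(support, separable=True):
    """Multiplications per pixel for direct Gaussian convolution."""
    # A separable kernel needs one 1D pass per dimension; otherwise every
    # tap of the full 2D kernel is visited.
    return 2 * support if separable else support * support


def recursive_mults_per_pixel(order=3, dims=2):
    """Feedback multiplications per pixel for the recursive filter."""
    # One causal and one anti-causal pass per dimension, `order` taps each.
    return dims * 2 * order


support = 101
print(direct_mults_per_pixel(support, separable=False))  # 10201
print(direct_mults_per_pixel(support))                   # 202
print(recursive_mults_per_pixel())                       # 12
```

The recursive count does not contain `support` or sigma at all, which is the sense in which the cost is independent of the standard deviation.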

The implementation overlaps causal-anticausal processing with row-column processing. Overlapping works similarly to fusion, but at a deeper, intra-stage algorithmic level. For more information, see [1].

C API functions

For the list of limitations, constraints and backends that implement the algorithm, consult the reference documentation of the following functions:

Function	Description
vpiCreateRecursiveGaussianFilter	Creates payload for vpiSubmitRecursiveGaussianFilter.
vpiSubmitRecursiveGaussianFilter	Runs a Recursive Gaussian Filter over an image.

Usage

Python

Import VPI module
import vpi
Use Recursive Gaussian Filter to blur the input image with \(\sigma=17\), using REFLECT boundary condition in the CUDA backend. Input and output are VPI images.
with vpi.Backend.CUDA:
    output = input.recursive_gaussian_filter(17, border=vpi.Border.REFLECT)
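The REFLECT boundary condition decides which pixel values the filter sees beyond the image edges. VPI documents this mode with the pattern `edcba|abcde|edcba`: the signal is mirrored and the edge sample is repeated. The following pure-Python sketch reproduces that pattern on a 1D sequence; `reflect_index` and `extend_reflect` are hypothetical helper names used only for illustration and are not part of the VPI API.

```python
def reflect_index(i, n):
    """Map any integer index into [0, n) by mirroring with edge repeat."""
    period = 2 * n
    i %= period  # Python's % yields a non-negative result for period > 0
    return i if i < n else period - 1 - i


def extend_reflect(signal, pad):
    """Return `signal` extended by `pad` reflected samples on each side."""
    n = len(signal)
    return [signal[reflect_index(i, n)] for i in range(-pad, n + pad)]


print("".join(extend_reflect("abcde", 5)))  # edcbaabcdeedcba
```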

C/C++

Initialization phase
1. Include the header that defines the Recursive Gaussian filter function.
  #include <vpi/algo/RecursiveGaussianFilter.h>
  
2. Define the input image object.
  VPIImage input = /*...*/;
  
3. Create the output image. It gets its dimensions and format from the input image.
  int32_t w, h;
  vpiImageGetSize(input, &w, &h);

  VPIImageFormat type;
  vpiImageGetFormat(input, &type);

  VPIImage output;
  vpiImageCreate(w, h, type, 0, &output);
  
4. Create the stream where the algorithm will be submitted for execution.
  VPIStream stream;
  vpiStreamCreate(0, &stream);
  
5. Create the Recursive Gaussian Filter payload. The width and height passed here are the maximum image dimensions that can later be processed with this payload by vpiSubmitRecursiveGaussianFilter.
  VPIPayload payload;
  vpiCreateRecursiveGaussianFilter(VPI_BACKEND_CUDA, w, h, &payload);
  
Processing phase
1. Submit the Recursive Gaussian Filter algorithm to the stream along with its parameters. It will be executed by the CUDA backend. The call defines a Gaussian filter with \(\sigma=17\) in both the X and Y directions, along with reflect border extension.
  vpiSubmitRecursiveGaussianFilter(stream, VPI_BACKEND_CUDA, payload, input, output, 17, 17, VPI_BORDER_REFLECT);
  
2. Optionally, wait until the processing is done.
  vpiStreamSync(stream);
  
Cleanup phase
1. Free resources held by the stream, input and output images and the algorithm payload.
  vpiStreamDestroy(stream);
  vpiImageDestroy(input);
  vpiImageDestroy(output);
  vpiPayloadDestroy(payload);
  

For more information, see Recursive Gaussian Filter in the "C API Reference" section of VPI - Vision Programming Interface.

Performance

For information on how to use the performance table below, see Algorithm Performance Tables.
Before comparing measurements, consult Comparing Algorithm Elapsed Times.
For further information on how performance was benchmarked, see Performance Benchmark.

References

[1] D. Nehab and A. Maximo, "Parallel recursive filtering of infinite input extensions," ACM Transactions on Graphics, vol. 35, no. 6, article 204, 2016.