VPI - Vision Programming Interface

0.2.0 Release

Image Inverse FFT

Overview

Image Inverse FFT implements the inverse Fourier Transform for 2D images, supporting real- and complex-valued outputs. Given a 2D spectrum (frequency domain), it returns the image representation in the spatial domain. It is the exact inverse of the Image FFT algorithm.

[Figures: input as a magnitude spectrum; output in spatial domain]

Implementation

The inverse Fast Fourier Transform closely follows the forward FFT's implementation, except for the sign of the exponent, which is positive here.

\[ I[m,n] = \frac{1}{MN} \sum^{M-1}_{u=0} \sum^{N-1}_{v=0} I'[u,v] e^{+2\pi i (\frac{um}{M}+\frac{vn}{N})} \]

Where:

  • \(I'\) is the input image in frequency domain
  • \(I\) is its spatial domain representation
  • \(M\times N\) are the input's dimensions

The normalization factor \(\frac{1}{MN}\) is applied by default to make IFFT an exact inverse of FFT. As this incurs a performance hit and normalization might not be needed, the user can pass the flag VPI_IFFT_DENORMALIZED to vpiSubmitImageIFFT to indicate that the output must be left denormalized.
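
To make the effect of the flag concrete: a denormalized output differs from the normalized one only by the missing \(\frac{1}{MN}\) factor, so every sample is \(MN\) times larger (a factor of 2,073,600 for a 1920x1080 image). The sketch below applies that factor by hand, which could be useful when the scaling is deferred to a later processing step. The helper name `normalize_ifft_output` and the tightly packed row-major float buffer are assumptions made for illustration; they are not part of VPI.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not a VPI function: applies the 1/(MN) factor to a
 * denormalized IFFT output by dividing every sample by width*height.
 * Assumes a tightly packed, row-major buffer of width*height floats. */
static void normalize_ifft_output(float *data, size_t width, size_t height)
{
    assert(data != NULL && width > 0 && height > 0);

    /* Multiply by the reciprocal once instead of dividing per sample. */
    const float scale = 1.0f / (float)(width * height);
    for (size_t i = 0; i < width * height; ++i) {
        data[i] *= scale;
    }
}
```

If the consumer of the output is insensitive to a global scale (for example, when the result is rescaled for display anyway), this step can be skipped entirely, which is the case the denormalized flag is meant for.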

As with direct FFT, depending on \(N\), different techniques are employed for best performance:

  • CPU backend
    • Fast paths when \(N\) can be factored into \(2^a \times 3^b \times 5^c\)
  • CUDA backend
    • Fast paths when \(N\) can be factored into \(2^a \times 3^b \times 5^c \times 7^d\)

In general, the smaller the prime factors, the better the performance; powers of two are fastest.
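
The factorization rules above can be checked ahead of time when choosing an image size. The following is a minimal sketch of such a check; the function names are hypothetical and not part of VPI, and it only tests whether a size qualifies for the listed fast paths, not how fast the transform will actually run.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true when n factors entirely into the first `count` primes of
 * {2, 3, 5, 7}. count = 3 matches the CPU rule, count = 4 the CUDA rule. */
static bool has_only_small_factors(uint32_t n, int count)
{
    static const uint32_t primes[4] = {2, 3, 5, 7};
    assert(n > 0 && count >= 1 && count <= 4);

    for (int i = 0; i < count; ++i) {
        while (n % primes[i] == 0) {
            n /= primes[i];
        }
    }
    return n == 1; /* anything left over is a larger prime factor */
}

/* Hypothetical convenience wrappers for the two backends. */
static bool is_cpu_fast_size(uint32_t n)  { return has_only_small_factors(n, 3); }
static bool is_cuda_fast_size(uint32_t n) { return has_only_small_factors(n, 4); }
```

For example, 1920 = 2^7 x 3 x 5 and 1080 = 2^3 x 3^3 x 5 qualify on both backends, while 626 = 2 x 313 does not, because 313 is prime. This is consistent with the slower 626x626 CPU timings reported in the Performance section.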

Image IFFT supports the following transform types:

  • complex-to-complex, C2C
  • complex-to-real, C2R.

Data Layout

Data layout depends strictly on the transform type. For a general C2C transform, both input and output data shall be of type VPI_IMAGE_TYPE_2F32 and have the same size. In C2R mode, each input image row \((X_1,X_2,\dots,X_{\lfloor\frac{N}{2}\rfloor+1})\) of type VPI_IMAGE_TYPE_2F32, containing only non-redundant values, results in an output row \((x_1,x_2,\dots,x_N)\) of type VPI_IMAGE_TYPE_F32 representing the whole image. In both cases, input and output image heights are the same.
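
The C2R input is narrower than the output because the spectrum of a real-valued row is conjugate-symmetric, so only \(\lfloor\frac{N}{2}\rfloor+1\) complex values per row carry independent information. A small sketch of the resulting size arithmetic follows; the helper names are hypothetical, and the byte count assumes tightly packed rows with no padding, which an actual VPI image may not guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Width, in complex samples, of the non-redundant C2R input for a real
 * output of the given width: floor(w/2) + 1. Unsigned division floors. */
static uint32_t c2r_input_width(uint32_t outputWidth)
{
    assert(outputWidth > 0);
    return outputWidth / 2 + 1;
}

/* Bytes in one tightly packed input row: each 2F32 pixel holds two
 * interleaved 32-bit floats (real and imaginary parts). */
static size_t c2r_input_row_bytes(uint32_t outputWidth)
{
    return (size_t)c2r_input_width(outputWidth) * 2 * sizeof(float);
}
```

For a 1920-pixel-wide real output, this gives an input width of 961 complex samples.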

Usage

  1. Initialization phase
    1. Include the header that defines the inverse FFT functions.
      #include <vpi/algo/ImageIFFT.h>
    2. Define the stream on which the algorithm will be executed and the input spectrum image. Since we're performing a C2R IFFT, the input width must be \(\lfloor \frac{w}{2} \rfloor+1\), where \(w\) is the output image width. Input and output heights must match.
      VPIStream stream = /*...*/;
      VPIImage input = /*...*/;
    3. Create the output image. Again, because this is a C2R IFFT, the output type must be VPI_IMAGE_TYPE_F32. Later, 0 is passed as flags to vpiSubmitImageIFFT so that it returns a normalized output.
      VPIImage output;
      vpiImageCreate(w, h, VPI_IMAGE_TYPE_F32, 0, &output);
    4. Create the ImageIFFT payload. Input is complex and output is real. Problem size refers to output size.
      VPIPayload ifft;
      vpiCreateImageIFFT(stream, w, h, VPI_IMAGE_TYPE_2F32, VPI_IMAGE_TYPE_F32, &ifft);
  2. Processing phase
    1. Submit the IFFT algorithm to the stream, passing the input spectrum and the output buffer.
      vpiSubmitImageIFFT(ifft, input, output, 0);
    2. Wait until the processing is done.
      vpiStreamSync(stream);
    3. Now display output using your preferred method.

Limitations and Constraints

Constraints for specific backends supersede the ones specified for all backends.

All Backends

CPU

  • If output has type VPI_IMAGE_TYPE_F32, its width must be even.
  • Input and output rows must be aligned to 4 bytes.

CUDA

  • Memory may be allocated in the first call to vpiSubmitImageIFFT and every time the input or output row stride changes with respect to the previous call.

PVA

  • Not implemented.

Performance

For further information on how performance was benchmarked, see Performance Measurement.

Jetson AGX Xavier
size       type  norm.  CPU        CUDA        PVA
1920x1080  C2R   yes    6.7 ms     0.9815 ms   n/a
1920x1080  C2R   no     6.5 ms     0.8172 ms   n/a
1920x1080  C2C   yes    20.9 ms    1.5312 ms   n/a
1920x1080  C2C   no     19.09 ms   1.2582 ms   n/a
1024x1024  C2R   yes    5.08 ms    0.2876 ms   n/a
1024x1024  C2R   no     4.8 ms     0.1979 ms   n/a
1024x1024  C2C   yes    7.8 ms     0.6044 ms   n/a
1024x1024  C2C   no     6.10 ms    0.4698 ms   n/a
626x626    C2R   yes    34.0 ms    0.470 ms    n/a
626x626    C2R   no     33.9 ms    0.4319 ms   n/a
626x626    C2C   yes    66.60 ms   0.833 ms    n/a
626x626    C2C   no     66.2 ms    0.7772 ms   n/a
Jetson TX2
size       type  norm.  CPU        CUDA        PVA
1920x1080  C2R   yes    17.02 ms   3.04 ms     n/a
1920x1080  C2R   no     16.87 ms   2.47 ms     n/a
1920x1080  C2C   yes    49.4 ms    5.07 ms     n/a
1920x1080  C2C   no     47.4 ms    4.18 ms     n/a
1024x1024  C2R   yes    12.14 ms   1.18 ms     n/a
1024x1024  C2R   no     11.90 ms   0.909 ms    n/a
1024x1024  C2C   yes    41.08 ms   1.91 ms     n/a
1024x1024  C2C   no     39.99 ms   1.52 ms     n/a
626x626    C2R   yes    72.67 ms   1.562 ms    n/a
626x626    C2R   no     72.9 ms    1.453 ms    n/a
626x626    C2C   yes    146.0 ms   3.05 ms     n/a
626x626    C2C   no     144.82 ms  2.92 ms     n/a
Jetson Nano
size       type  norm.  CPU        CUDA        PVA
1920x1080  C2R   yes    34.1 ms    6.95 ms     n/a
1920x1080  C2R   no     34.0 ms    5.712 ms    n/a
1920x1080  C2C   yes    86.3 ms    12.28 ms    n/a
1920x1080  C2C   no     87.2 ms    10.46 ms    n/a
1024x1024  C2R   yes    21.3 ms    2.480 ms    n/a
1024x1024  C2R   no     21.11 ms   1.897 ms    n/a
1024x1024  C2C   yes    69.8 ms    4.248 ms    n/a
1024x1024  C2C   no     64.8 ms    3.196 ms    n/a
626x626    C2R   yes    171.7 ms   3.596 ms    n/a
626x626    C2R   no     172 ms     3.354 ms    n/a
626x626    C2C   yes    342 ms     6.662 ms    n/a
626x626    C2C   no     341 ms     6.262 ms    n/a