Inverse FFT

*Inverse FFT* implements the inverse Fourier Transform for 2D images, supporting real- and complex-valued outputs. Given a 2D spectrum (frequency domain), it returns the image representation on the spatial domain. It is the exact inverse of FFT algorithm.

Input as a magnitude spectrum | Output in spatial domain |
---|---|

The inverse Fast Fourier Transform follows closely forward FFT's implementation, except for the sign of the exponent, which is positive here.

\[ I[m,n] = \frac{1}{MN} \sum^{M-1}_{u=0} \sum^{N-1}_{v=0} I'[u,v] e^{+2\pi i (\frac{um}{M}+\frac{vn}{N})} \]

Where:

- \(I'\) is the input image in frequency domain
- \(I\) is its spatial domain representation
- \(M\times N\) is input's dimensions

The normalization factor \(\frac{1}{MN}\) is applied by default to make IFFT an exact inverse of FFT. As this incurs in a performance hit and normalization might not be needed, user can pass the flag VPI_DENORMALIZED_OUTPUT to vpiSubmitIFFT to indicate that output must be left denormalized.

As with direct FFT, depending on \(N\), different techniques are employed for best performance:

- CPU backend
- Fast paths when \(N\) can be factored into \(2^a \times 3^b \times 5^c\)

- CUDA backend
- Fast paths when \(N\) can be factored into \(2^a \times 3^b \times 5^c \times 7^d\)

In general the smaller the prime factor is, the better the performance, i.e., powers of two are fastest.

IFFT supports the following transform types:

- complex-to-complex, C2C
- complex-to-real, C2R.

Data layout depends strictly on the transform type. In case of general C2C transform, both input and output data shall be of type VPI_IMAGE_FORMAT_2F32 and have the same size. In C2R mode, each input image row \((X_1,X_2,\dots,X_{\lfloor\frac{N}{2}\rfloor+1})\) of type VPI_IMAGE_FORMAT_2F32 containing only non-redundant values results in row \((x_1,x_2,\dots,x_N)\) with type VPI_IMAGE_FORMAT_F32 representing the whole image. In both cases, input and output image heights are the same.

- Note
- There can be memory allocation in first call to vpiSubmitIFFT and every time the input or output row stride changes with respect to previous call.

For list of limitations, constraints and backends that implements the algorithm, consult reference documentation of the following functions:

Function | Description |
---|---|

vpiCreateIFFT | Creates payload for inverse Fast Fourier Transform algorithm. |

vpiSubmitIFFT | Runs the inverse Fast Fourier Transform on single image. |

- Attention
- This algorithm requires that the library libcufft.so.10 be installed in the system.

- Import VPI module import vpi
- Run the C2R IFFT on a VPI image input with spectrum data in
`vpi.Format._2F32`

format. The resulting VPI image has`vpi.Format.F32`

format. Operation is executed by the CUDA backend.with vpi.Backend.CUDA:output = input.irfft()

- Initialization phase
- Include the header that defines the inverse FFT functions. #include <vpi/algo/FFT.h>Declares functions that implement the Fast Fourier Transform algorithm and its inverse.
- Define the input spectrum image. Since we're performing a C2R IFFT, the input width must be \(\lfloor \frac{w}{2} \rfloor+1\), where \(w\) is the output image width. Input and output heights must match. Image format must be VPI_IMAGE_FORMAT_2F32. VPIImage input = /*...*/;
- Create the output image. Again, because of C2F IFFT, the output type must be VPI_IMAGE_FORMAT_F32. We're passing 0 as flags to make the function return a normalized output. VPIImage output;vpiImageCreate(w, h, VPI_IMAGE_FORMAT_F32, 0, &output);#define VPI_IMAGE_FORMAT_F32Single plane with one 32-bit floating point channel.
**Definition:**ImageFormat.h:136VPIStatus vpiImageCreate(int32_t width, int32_t height, VPIImageFormat fmt, uint64_t flags, VPIImage *img)Create an empty image instance with the specified flags. - Create the IFFT payload on the CUDA backend. Input is complex and output is real. Problem size refers to output size. VPIPayload ifft;#define VPI_IMAGE_FORMAT_2F32Single plane with two interleaved 32-bit floating point channels.
**Definition:**ImageFormat.h:142VPIStatus vpiCreateIFFT(uint64_t backends, int32_t outputWidth, int32_t outputHeight, const VPIImageFormat inFormat, const VPIImageFormat outFormat, VPIPayload *payload)Creates payload for inverse Fast Fourier Transform algorithm. - Create the stream where the algorithm will be submitted for execution. VPIStream stream;vpiStreamCreate(0, &stream);VPIStatus vpiStreamCreate(uint64_t flags, VPIStream *stream)Create a stream instance.

- Include the header that defines the inverse FFT functions.
- Processing phase
- Submit the IFFT algorithm to the stream, passing the input spectrum and the output buffer. It'll be executed on the CUDA backend, since the payload is created there. vpiSubmitIFFT(stream, VPI_BACKEND_CUDA, ifft, input, output, 0);VPIStatus vpiSubmitIFFT(VPIStream stream, uint64_t backend, VPIPayload payload, VPIImage input, VPIImage output, uint64_t flags)Runs the inverse Fast Fourier Transform on single image.
- Wait until the processing is done. vpiStreamSync(stream);VPIStatus vpiStreamSync(VPIStream stream)Blocks the calling thread until all submitted commands in this stream queue are done (queue is empty)...
- Now display
*output*using your preferred method.

- Submit the IFFT algorithm to the stream, passing the input spectrum and the output buffer. It'll be executed on the CUDA backend, since the payload is created there.
- Cleanup phase
- Free resources held by the stream, the payload and the input and output images. vpiStreamDestroy(stream);vpiPayloadDestroy(ifft);vpiImageDestroy(input);vpiImageDestroy(output);void vpiPayloadDestroy(VPIPayload payload)Deallocates the payload object and all associated resources.void vpiStreamDestroy(VPIStream stream)Destroy a stream instance and deallocate all HW resources.

- Free resources held by the stream, the payload and the input and output images.

For more information, see Fast Fourier Transform in the "C API Reference" section of *VPI - Vision Programming Interface*.

For information on how to use the performance table below, see Algorithm Performance Tables.

Before comparing measurements, consult Comparing Algorithm Elapsed Times.

For further information on how performance was benchmarked, see Performance Benchmark.