FFTOptions

class nvmath.device.FFTOptions(size, precision, fft_type, code_type, execution, *, direction=None, ffts_per_block=None, elements_per_thread=None, real_fft_options=None)

A class that encapsulates a partial FFT device function. A partial device function can be queried for the available or optimal values of some knobs (such as ffts_per_block or elements_per_thread). It does not contain a compiled, ready-to-use device function until it is finalized using create().

Parameters:

size (int) – The size of the FFT to calculate.
precision (str) – The computation precision, specified as a numpy float dtype. Currently supports numpy.float16, numpy.float32 and numpy.float64.
fft_type (str) – A string specifying the type of FFT operation, can be 'c2c', 'c2r' or 'r2c'.
code_type (CodeType) – The target GPU code and compute-capability.
execution (str) – A string specifying the execution method, can be 'Block' or 'Thread'.
direction (str) – A string specifying the direction of FFT, can be 'forward' or 'inverse'. If not provided, will be 'forward' if real-to-complex FFT is specified and 'inverse' if complex-to-real FFT is specified.
ffts_per_block (int) – The number of FFTs calculated per CUDA block, optional. The default is 1. Alternatively, if provided as 'suggested', will be set to a suggested value.
elements_per_thread (int) – The number of elements per thread, optional. The default is 1. Alternatively, if provided as 'suggested', will be set to a suggested value.
real_fft_options (dict) –
A dictionary specifying the options for real FFT operation, optional. Users may specify the following options in the dictionary:
- 'complex_layout', currently supports 'natural', 'packed', and 'full'.
- 'real_mode', currently supports 'normal' and 'folded'.
execute_api –

Changed in version 0.5.0: execute_api is no longer part of the FFT type. Pass this argument to nvmath.device.fft() instead.

Note

The class is not meant to be used directly with its constructor. Users are instead advised to use fft() to create the object.
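
Example

The parameter rules listed above can be illustrated with a small, standalone sketch. The helpers resolve_direction and check_knob are hypothetical and are not part of nvmath. They only mirror the documented string values, and they assume the cuFFTDx convention that a real-to-complex FFT defaults to 'forward' and a complex-to-real FFT defaults to 'inverse'. Requiring an explicit direction for 'c2c' is also an assumption, since this page does not state a default for that case.

```python
# Hypothetical helpers (not part of nvmath) that mirror the documented
# parameter values for FFTOptions. No GPU or nvmath install is needed.

FFT_TYPES = ("c2c", "c2r", "r2c")
DIRECTIONS = ("forward", "inverse")


def resolve_direction(fft_type, direction=None):
    """Return the FFT direction, applying the assumed defaults for real FFTs."""
    if fft_type not in FFT_TYPES:
        raise ValueError(f"fft_type must be one of {FFT_TYPES}, got {fft_type!r}")
    if direction is not None:
        if direction not in DIRECTIONS:
            raise ValueError(f"direction must be one of {DIRECTIONS}, got {direction!r}")
        return direction
    if fft_type == "r2c":
        return "forward"  # assumed default: real-to-complex is a forward FFT
    if fft_type == "c2r":
        return "inverse"  # assumed default: complex-to-real is an inverse FFT
    # Assumption: a complex-to-complex FFT has no implied direction.
    raise ValueError("direction must be given explicitly for 'c2c'")


def check_knob(name, value):
    """Accept None, the string 'suggested', or a positive integer."""
    if value is None or value == "suggested":
        return value
    if isinstance(value, int) and not isinstance(value, bool) and value >= 1:
        return value
    raise ValueError(f"{name} must be a positive int or 'suggested', got {value!r}")


print(resolve_direction("r2c"))
print(resolve_direction("c2c", direction="inverse"))
print(check_knob("ffts_per_block", "suggested"))
```

In real code these checks are unnecessary, because fft() performs its own validation. The sketch is only meant to make the accepted values and the assumed defaults concrete.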