compile_prolog

nvmath.fft.compile_prolog(
prolog_fn,
element_dtype,
user_info_dtype,
*,
compute_capability=None,
)

Compile a Python function to LTO-IR for use as a prolog function in fft() and plan().
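As a rough illustration of the workflow, the sketch below defines a prolog with the required four-argument signature and wraps the compile_prolog() call in a helper. The names scale_prolog, compile_scale_prolog and emulate_prolog_on_host are hypothetical. The choice of 'complex64' for both data types, and the treatment of user_info as an indexable buffer of per-element factors, are assumptions of this example. The helper imports nvmath lazily, so the host-side emulation at the bottom runs on a machine without a GPU.

```python
def scale_prolog(data_in, offset, user_info, reserved_for_future_use):
    # Value the FFT loads in place of data_in[offset]: here, the input
    # element multiplied by a per-element factor read from user_info.
    return data_in[offset] * user_info[offset]


def compile_scale_prolog(compute_capability=None):
    # Lazy import: requires nvmath-python and a working CUDA setup.
    import nvmath

    return nvmath.fft.compile_prolog(
        scale_prolog,
        "complex64",  # element_dtype: assumed data type of the FFT operand
        "complex64",  # user_info_dtype: assumed data type of the factors
        compute_capability=compute_capability,
    )


def emulate_prolog_on_host(prolog_fn, data, user_info):
    # CPU-only stand-in for the load step: call the prolog once per flat
    # offset. Useful for checking the callback logic, nothing more.
    return [prolog_fn(data, i, user_info, None) for i in range(len(data))]


samples = [1 + 1j, 2 + 0j, 0 - 3j]
factors = [0.5, 2.0, 1.0]
loaded = emulate_prolog_on_host(scale_prolog, samples, factors)
```

On a CUDA-capable machine, calling compile_scale_prolog() should produce the LTO-IR for scale_prolog; the emulation only mirrors the per-offset calling convention on the host.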

Parameters:
  • prolog_fn – The prolog function to be compiled to LTO-IR. It must have the signature prolog_fn(data_in, offset, user_info, reserved_for_future_use) and return the element of data_in at offset, transformed as required.

  • element_dtype – The data type of the data_in argument, one of ['float32', 'float64', 'complex64', 'complex128']. It must match the data type of the FFT operand for prolog functions or of the FFT result for epilog functions.

  • user_info_dtype

    The data type of the user_info argument. It must be one of ['float32', 'float64', 'complex64', 'complex128'] or an object of type numba.types.Type.

    The offset is computed from the memory layout (shape and strides) of the operand (the input for a prolog, the output for an epilog). If you pass an additional tensor as user_info and access it using the offset, you need to know the memory layout of the operand. Note that the actual layout of the input tensor may differ from the layout of the tensor passed to the fft() call. To learn the memory layout of the input or output, use the stateful FFT API and call nvmath.fft.FFT.get_input_layout() or nvmath.fft.FFT.get_output_layout(), respectively.

    Note

    Currently, in the callback, the position of an element in the input and output operands is described by a single flat offset, even if the original operand is a multi-dimensional tensor.

  • compute_capability – The target compute capability, specified as a string ('80', '89', …). The default is the compute capability of the current device.
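To make the flat offset and the layout caveat for user_info concrete, here is a small host-only sketch. It assumes strides are counted in elements rather than bytes, and the helper names (row_major_strides, column_major_strides, flat_offset) are hypothetical. It shows that the same logical index maps to different flat offsets under two layouts, so a user_info tensor indexed with the operand's offset only lines up with the operand if both share one layout.

```python
from itertools import product


def row_major_strides(shape):
    # Element strides of a C-ordered tensor (last axis varies fastest).
    strides, step = [], 1
    for extent in reversed(shape):
        strides.append(step)
        step *= extent
    return tuple(reversed(strides))


def column_major_strides(shape):
    # Element strides of a Fortran-ordered tensor (first axis varies fastest).
    strides, step = [], 1
    for extent in shape:
        strides.append(step)
        step *= extent
    return tuple(strides)


def flat_offset(index, strides):
    # Single flat offset of a multi-dimensional index.
    return sum(i * s for i, s in zip(index, strides))


shape = (2, 3)
c_strides = row_major_strides(shape)
f_strides = column_major_strides(shape)

# Offsets at which each layout stores the logical element (1, 0).
offset_c = flat_offset((1, 0), c_strides)
offset_f = flat_offset((1, 0), f_strides)

# Logical indices for which both layouts happen to agree on the offset.
agreeing = [
    idx
    for idx in product(*(range(n) for n in shape))
    if flat_offset(idx, c_strides) == flat_offset(idx, f_strides)
]
```

Only two of the six elements land on the same offset in both layouts, which is why comparing the shape and strides of user_info against the operand layout before relying on the offset is worthwhile.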

Returns:

The function compiled to LTO-IR as a bytes object.
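The returned bytes are what gets handed to fft() or plan() as the prolog. The following is a minimal sketch, assuming the prolog argument accepts a dictionary with an "ltoir" entry for the compiled function and a "data" entry for the device pointer of the user_info buffer. Treat these keys as an assumption and check the nvmath.fft.DeviceCallable documentation. It also assumes the user data is a CuPy array. make_prolog_arg and fft_with_prolog are hypothetical helper names.

```python
def make_prolog_arg(ltoir, user_data_ptr):
    # compile_prolog() returns bytes; reject anything else early.
    if not isinstance(ltoir, bytes):
        raise TypeError("ltoir must be the bytes object returned by compile_prolog()")
    # ASSUMPTION: "ltoir" and "data" are the keys expected for the callback.
    return {"ltoir": ltoir, "data": int(user_data_ptr)}


def fft_with_prolog(operand, ltoir, user_data):
    # Lazy import: requires nvmath-python and a working CUDA setup.
    import nvmath

    # ASSUMPTION: user_data is a CuPy array, so .data.ptr is its device address.
    prolog = make_prolog_arg(ltoir, user_data.data.ptr)
    return nvmath.fft.fft(operand, prolog=prolog)


# Host-only check with placeholder values (not real LTO-IR, not a real pointer).
placeholder = make_prolog_arg(b"\x00placeholder", 0x7F000000)
```

Keeping the type check in a separate helper means a mistake such as passing the Python function itself instead of its compiled form is caught before any GPU work starts.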

Notes

  • The user must ensure that the specified argument types meet the requirements listed above.
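Since the caller is responsible for meeting these type requirements, a small pre-check before compiling can help. The sketch below is one possible approach, not part of nvmath. check_element_dtype is a hypothetical helper, and it assumes the operand's data type is available as a string name (for example via str(operand.dtype) for NumPy or CuPy arrays).

```python
ALLOWED_DTYPE_NAMES = ("float32", "float64", "complex64", "complex128")


def check_element_dtype(element_dtype, operand_dtype):
    # element_dtype must be one of the names listed for compile_prolog().
    if element_dtype not in ALLOWED_DTYPE_NAMES:
        raise ValueError(
            f"element_dtype must be one of {ALLOWED_DTYPE_NAMES}, got {element_dtype!r}"
        )
    # For a prolog, element_dtype must match the FFT operand's data type.
    if element_dtype != operand_dtype:
        raise ValueError(
            f"element_dtype {element_dtype!r} does not match operand dtype {operand_dtype!r}"
        )
    return element_dtype


checked = check_element_dtype("complex64", "complex64")
```

The helper only covers string names for element_dtype; a user_info_dtype given as a numba.types.Type object would need its own check.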