compile_prolog
nvmath.fft.compile_prolog(prolog_fn, element_dtype, user_info_dtype, *, compute_capability=None)
Compile a Python function to LTO-IR to provide as a prolog function for fft() and plan().
Parameters:
prolog_fn – The prolog function to be compiled to LTO-IR. It must have the signature prolog_fn(data_in, offset, user_info, reserved_for_future_use), and it returns the transformed data_in at offset.
element_dtype – The data type of the data_in argument, one of ['float32', 'float64', 'complex64', 'complex128']. It must match the data type of the FFT operand for prolog functions or of the FFT result for epilog functions.
user_info_dtype – The data type of the user_info argument. It must be one of ['float32', 'float64', 'complex64', 'complex128'] or an object of type numba.types.Type. The offset is computed based on the memory layout (shape and strides) of the operand (input for prolog, output for epilog). If the user would like to pass an additional tensor as user_info and access it based on the offset, it is crucial to know the memory layout of the operand. Please note that the actual layout of the input tensor may differ from the layout of the tensor passed to the fft call. To learn the memory layout of the input or output, use the stateful FFT API and nvmath.fft.FFT.get_input_layout() or nvmath.fft.FFT.get_output_layout(), respectively.
Note
Currently, in the callback, the position of an element in the input and output operands is described with a single flat offset, even if the original operand is a multi-dimensional tensor.
compute_capability – The target compute capability, specified as a string ('80', '89', …). The default is the compute capability of the current device.
Returns:
The function compiled to LTO-IR as a bytes object.
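As a rough illustration of the callback contract described above, the sketch below defines a prolog with the documented four-argument signature and emulates it on the CPU in plain Python. The names scale_by_filter, flat_offset, emulate_prolog and compile_for_gpu are hypothetical. The CPU loop is only a stand-in for what happens on the GPU, strides are assumed to be counted in elements, and the compile call is wrapped in a function that is not invoked here because it needs nvmath-python and a CUDA device.

```python
def scale_by_filter(data_in, offset, user_info, reserved_for_future_use):
    # Documented prolog signature: return the transformed element at `offset`.
    # `user_info` is read with the same flat offset, so it is assumed to
    # share the operand's memory layout.
    return data_in[offset] * user_info[offset]


def flat_offset(index, strides):
    # Map a multi-dimensional index to a single flat offset.
    # Assumption for this sketch: strides are in elements, not bytes.
    return sum(i * s for i, s in zip(index, strides))


def emulate_prolog(prolog_fn, flat_data, flat_info, shape, strides):
    # CPU stand-in (not nvmath's implementation): visit every 2-D index and
    # hand the callback only the flat offset derived from the layout.
    rows, cols = shape
    result = {}
    for r in range(rows):
        for c in range(cols):
            off = flat_offset((r, c), strides)
            result[(r, c)] = prolog_fn(flat_data, off, flat_info, None)
    return result


def compile_for_gpu():
    # Not called in this sketch: requires nvmath-python and a CUDA device.
    import nvmath

    # element_dtype matches the FFT operand; user_info_dtype matches the filter.
    return nvmath.fft.compile_prolog(scale_by_filter, "complex64", "complex64")


data = [complex(i) for i in range(6)]   # six elements in one flat buffer
filt = [0.5] * 6                        # extra tensor passed as user_info

# The same buffer interpreted as a 2x3 operand with two different layouts.
row_major = emulate_prolog(scale_by_filter, data, filt, (2, 3), (3, 1))
col_major = emulate_prolog(scale_by_filter, data, filt, (2, 3), (1, 2))
```

The two calls read different buffer positions for the same logical index (0, 1), which is why a user_info tensor indexed by offset has to follow the layout reported for the operand rather than the layout of the tensor originally passed in.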
Notes
The user must ensure that the specified argument types meet the requirements listed above.
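Since the argument types are the caller's responsibility, a small pre-flight check can catch obvious mistakes before compilation. The helper below is a hypothetical sketch and not part of nvmath. It only inspects the number of callback parameters and the string dtype names listed above, and it skips validation when user_info_dtype is not a string (for example, a numba.types.Type object).

```python
import inspect

# Dtype names listed in the parameter descriptions above.
ALLOWED_DTYPES = ("float32", "float64", "complex64", "complex128")


def check_prolog_args(prolog_fn, element_dtype, user_info_dtype):
    """Return a list of problems found; an empty list means none were detected."""
    problems = []

    # The documented signature has exactly four parameters.
    n_params = len(inspect.signature(prolog_fn).parameters)
    if n_params != 4:
        problems.append(f"prolog_fn takes {n_params} parameters, expected 4")

    if element_dtype not in ALLOWED_DTYPES:
        problems.append(f"unsupported element_dtype: {element_dtype!r}")

    # Only string names are checked; other objects are left for the library
    # to validate.
    if isinstance(user_info_dtype, str) and user_info_dtype not in ALLOWED_DTYPES:
        problems.append(f"unsupported user_info_dtype: {user_info_dtype!r}")

    return problems


def identity_prolog(data_in, offset, user_info, reserved_for_future_use):
    return data_in[offset]


issues = check_prolog_args(identity_prolog, "complex64", "float32")
```

A check like this cannot confirm that element_dtype actually matches the FFT operand; that still has to be verified against the operand itself.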