compile_prolog
nvmath.fft.compile_prolog(prolog_fn, element_dtype, user_info_dtype, *, compute_capability=None)
Compile a Python function to LTO-IR to provide as a prolog function for fft() and plan().
- Parameters:
prolog_fn – The prolog function to be compiled to LTO-IR. It must have the signature prolog_fn(data_in, offset, user_info, reserved_for_future_use), and it returns the transformed value of data_in at offset.
element_dtype – The data type of the data_in argument, one of ['float32', 'float64', 'complex64', 'complex128']. It must be the same as the data type of the FFT operand for prolog functions or of the FFT result for epilog functions.
user_info_dtype –
The data type of the user_info argument. It must be one of ['float32', 'float64', 'complex64', 'complex128'] or an object of type numba.types.Type. The offset is computed based on the memory layout (shape and strides) of the operand (input for prolog, output for epilog). If the user would like to pass an additional tensor as user_info and access it based on the offset, it is crucial to know the memory layout of the operand. Please note that the actual layout of the input tensor may differ from the layout of the tensor passed to the fft call. To learn the memory layout of the input or output, please use the stateful FFT API and nvmath.fft.FFT.get_input_layout() or nvmath.fft.FFT.get_output_layout(), respectively.
Note
Currently, in the callback, the position of the element in the input and output operands is described with a single flat offset, even if the original operand is a multi-dimensional tensor.
compute_capability – The target compute capability, specified as a string ('80', '89', …). The default is the compute capability of the current device.
- Returns:
The function compiled to LTO-IR as a bytes object.
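As a minimal sketch of the callback contract described above, the following pure-Python code defines a prolog-style function with the required four-argument signature and emulates on the host how it would be invoked once per flat offset. The names rescale and apply_prolog_on_host are hypothetical and are not part of nvmath; the host loop only illustrates the semantics and is not how the library executes the callback on the GPU.

```python
# A prolog-style function with the documented signature.
# It returns the transformed element of data_in at the given flat offset.
def rescale(data_in, offset, user_info, reserved_for_future_use):
    # user_info is assumed here to be a per-element weight laid out
    # like data_in, so the same flat offset indexes both.
    return data_in[offset] * user_info[offset]


# Hypothetical host-side emulation: call the prolog once per flat offset,
# mimicking "load element -> transform -> feed to the FFT".
def apply_prolog_on_host(prolog_fn, flat_data, user_info):
    return [prolog_fn(flat_data, i, user_info, None) for i in range(len(flat_data))]


signal = [1 + 0j, 2 + 0j, 3 + 0j, 4 + 0j]
window = [0.5, 1.0, 1.0, 0.5]
loaded = apply_prolog_on_host(rescale, signal, window)

# On a system with nvmath and a CUDA device, the same function could be
# compiled following the signature documented above, for example:
#   ltoir = nvmath.fft.compile_prolog(rescale, "complex64", "float32")
# (not executed here; the dtypes are illustrative assumptions).
```

Checking the function on the host first can help catch indexing mistakes before compiling it to LTO-IR, provided the function body only uses operations that the compiler supports.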
See also
Notes
The user must ensure that the specified argument types meet the requirements listed above.
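Because the callback receives a single flat offset, a tensor passed as user_info has to be indexed consistently with the operand's memory layout. The sketch below illustrates, in plain Python, how one multi-dimensional index maps to different flat offsets under two layouts. The helpers flat_offset and c_contiguous_strides are hypothetical, and strides are assumed to be expressed in elements rather than bytes.

```python
# Flat offset of a multi-dimensional index for given strides
# (strides assumed to be in units of elements).
def flat_offset(index, strides):
    return sum(i * s for i, s in zip(index, strides))


# Strides of a C-contiguous (row-major) tensor with the given shape.
def c_contiguous_strides(shape):
    strides = []
    step = 1
    for extent in reversed(shape):
        strides.append(step)
        step *= extent
    return tuple(reversed(strides))


shape = (2, 3)
row_major = c_contiguous_strides(shape)  # last axis varies fastest
col_major = (1, 2)                       # first axis varies fastest

offsets_row = [[flat_offset((i, j), row_major) for j in range(3)] for i in range(2)]
offsets_col = [[flat_offset((i, j), col_major) for j in range(3)] for i in range(2)]
```

The element at index (0, 1) sits at flat offset 1 in the row-major layout but at offset 2 in the column-major one, so a user_info tensor prepared for one layout would be read at the wrong positions under the other.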