irfft
nvmath.distributed.fft.irfft(operand, distribution: Slab | Sequence[Box], sync_symmetric_memory: bool = True, options: FFTOptions | None = None, stream: AnyStream | None = None)
Perform an N-D complex-to-real (C2R) distributed FFT on the provided complex operand. The direction is implicitly inverse.
- Parameters:
  operand – A tensor (ndarray-like object). The currently supported types are numpy.ndarray, cupy.ndarray, and torch.Tensor.
  Important: GPU operands must be on the symmetric heap (for example, allocated with nvmath.distributed.allocate_symmetric_memory()).
  distribution – Specifies the distribution of input and output operands across processes, which can be: (i) according to a Slab distribution (see Slab), or (ii) a custom box distribution. With Slab distribution, this indicates the distribution of the input operand (the output operand will use the complementary Slab distribution). With box distribution, this indicates the input and output boxes.
  sync_symmetric_memory – Indicates whether to issue a symmetric memory synchronization operation on the execute stream before the FFT. Note that before the FFT starts executing, it is required that the input operand be ready on all processes. A symmetric memory synchronization ensures completion and visibility by all processes of previously issued local stores to symmetric memory. Advanced users who choose to manage the synchronization on their own using the appropriate NVSHMEM API, or who know that GPUs are already synchronized on the source operand, can set this to False.
  options – Specify options for the FFT as a FFTOptions object. Alternatively, a dict containing the parameters for the FFTOptions constructor can also be provided. If not specified, the value will be set to the default-constructed FFTOptions object.
  stream – Provide the CUDA stream to use for executing the operation. Acceptable inputs include cudaStream_t (as Python int), cupy.cuda.Stream, and torch.cuda.Stream. If a stream is not provided, the current stream from the operand package will be used.
- Returns:
  A real tensor whose shape will depend on the choice of distribution and reshape option. The operand remains on the same device and belongs to the same package as the input operand. The global extent of the last transformed axis in the result will be (global_extent[-1] - 1) * 2 if FFTOptions.last_axis_parity is even, or global_extent[-1] * 2 - 1 if FFTOptions.last_axis_parity is odd.
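The extent rule above can be checked with plain integer arithmetic. The following is a minimal sketch, not part of nvmath: the helper names c2r_last_axis_extent and r2c_last_axis_extent are hypothetical, and the forward rule n // 2 + 1 is the usual half-spectrum length of an R2C transform rather than something stated on this page.

```python
def c2r_last_axis_extent(complex_extent, last_axis_parity="even"):
    """Global real extent of the last axis, following the rule in Returns.

    Hypothetical helper for illustration; not an nvmath function.
    """
    if last_axis_parity == "even":
        return (complex_extent - 1) * 2
    if last_axis_parity == "odd":
        return complex_extent * 2 - 1
    raise ValueError("last_axis_parity must be 'even' or 'odd'")


def r2c_last_axis_extent(real_extent):
    """Half-spectrum length kept by an R2C transform (assumed: n // 2 + 1)."""
    return real_extent // 2 + 1


# Real extents 256 and 257 both produce a complex extent of 129,
# so the complex shape alone cannot tell them apart.
half_even = r2c_last_axis_extent(256)
half_odd = r2c_last_axis_extent(257)
print(half_even, half_odd)

# The parity option resolves the ambiguity on the way back.
print(c2r_last_axis_extent(129, "even"))
print(c2r_last_axis_extent(129, "odd"))
```

Because two different real extents map to the same complex extent, the parity has to be supplied explicitly when the original real extent was odd.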
Example
>>> import cupy as cp
>>> import nvmath.distributed

Get MPI communicator used to initialize nvmath.distributed (for information on initializing nvmath.distributed, you can refer to the documentation or to the FFT examples in nvmath/examples/distributed/fft):
>>> comm = nvmath.distributed.get_context().communicator
>>> nranks = comm.Get_size()
>>> from nvmath.distributed.fft import Slab
Create a 3-D Hermitian-symmetric complex128 ndarray on GPU symmetric memory by applying an R2C FFT to a real float64 operand:
>>> shape = 512 // nranks, 768, 256
>>> a = nvmath.distributed.allocate_operand(
...     shape, cp, input_dtype=cp.float64, distribution=Slab.X, fft_type="R2C"
... )
>>> a[:] = cp.random.rand(*shape, dtype=cp.float64)
>>> b = nvmath.distributed.fft.rfft(a, distribution=Slab.X)
Perform a 3-D C2R FFT using the irfft() wrapper. The result r is a CuPy float64 ndarray:
>>> r = nvmath.distributed.fft.irfft(b, distribution=Slab.X)
>>> r.dtype
dtype('float64')
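The local shape 512 // nranks, 768, 256 used in the example follows from partitioning a global real operand along one axis. The sketch below works through that arithmetic in plain Python under stated assumptions: slab_local_shape is a hypothetical helper (not an nvmath function), Slab.X is assumed to split the first axis and Slab.Y the second, extents are assumed to divide evenly across processes, and nranks = 4 is an arbitrary illustrative value.

```python
def slab_local_shape(global_shape, partition_axis, nranks):
    """Per-process shape when one axis is split evenly across nranks processes.

    Hypothetical helper for illustration; handles only evenly divisible extents.
    """
    if global_shape[partition_axis] % nranks != 0:
        raise ValueError("this sketch only handles evenly divisible extents")
    local = list(global_shape)
    local[partition_axis] //= nranks
    return tuple(local)


nranks = 4  # illustrative value; the real value comes from the communicator

# Global real operand from the example: first axis split across processes.
real_global = (512, 768, 256)
real_local_x = slab_local_shape(real_global, 0, nranks)

# The R2C result keeps only n // 2 + 1 entries along the last axis.
complex_global = real_global[:-1] + (real_global[-1] // 2 + 1,)

# Complementary slab: the second axis is split instead (assumption: Slab.Y).
complex_local_y = slab_local_shape(complex_global, 1, nranks)

print(real_local_x)
print(complex_local_y)
```

This only illustrates the shape bookkeeping; the actual layout of operands is determined by the library and the distribution argument passed to it.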
Notes
- This function performs an inverse C2R N-D FFT, which is similar to irfftn but different from irfft in various numerical packages.
- This function is a convenience wrapper around FFT and is specifically meant for single use. The same computation can be performed with the stateful API by setting FFTOptions.fft_type to 'C2R' and passing the argument direction='inverse' when calling FFT.execute().
- The input to this function must be Hermitian-symmetric, otherwise the result is undefined. While the symmetry requirement is partially captured by the different global extents in the last transformed dimension between the input and result, there are additional constraints. In addition, if the input to irfft was generated using an R2C FFT with an odd global last axis size, FFTOptions.last_axis_parity must be set to odd to recover the original signal.
- For more details, please refer to R2C/C2R example and odd C2R example.
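To make the Hermitian-symmetry requirement concrete, here is a small single-process sketch using a naive 1-D DFT written in pure Python. It does not use nvmath; the functions dft, idft, and is_hermitian are hypothetical helpers introduced only for illustration. It shows that the spectrum of a real signal satisfies X[k] = conj(X[-k mod n]), and that breaking this property yields an inverse transform with non-negligible imaginary parts, which a real-valued output cannot represent.

```python
import cmath


def dft(x):
    """Naive forward DFT of a 1-D sequence (O(n^2), for illustration only)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse DFT, normalized by 1/n."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def is_hermitian(spectrum, tol=1e-9):
    """Check X[k] == conj(X[-k mod n]) for every k, within a tolerance."""
    n = len(spectrum)
    return all(
        abs(spectrum[k] - spectrum[(-k) % n].conjugate()) <= tol for k in range(n)
    )


real_signal = [0.5, -1.0, 2.0, 0.25, 3.0]  # odd length on purpose
spectrum = dft(real_signal)

# Break the symmetry by changing one coefficient without its mirror partner.
perturbed = list(spectrum)
perturbed[1] += 1j

print(is_hermitian(spectrum))   # True
print(is_hermitian(perturbed))  # False

# Largest imaginary part of each inverse transform.
max_imag_ok = max(abs(v.imag) for v in idft(spectrum))
max_imag_bad = max(abs(v.imag) for v in idft(perturbed))
print(max_imag_ok < 1e-9, max_imag_bad > 1e-3)
```

A C2R transform reads only part of the spectrum and assumes the rest by symmetry, so an input like perturbed has no well-defined real result.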