allocate_operand
nvmath.distributed.fft.allocate_operand(
    shape: Sequence[int],
    package: ModuleType,
    *,
    input_dtype=None,
    distribution: Slab | Sequence[Sequence[Sequence[int]]],
    memory_space: Literal['cpu', 'cuda'] | None = None,
    fft_type: Literal['C2C', 'C2R', 'R2C'] | None = None,
    logger: Logger | None = None,
)
Return an uninitialized operand of the given shape and type to use as input for a distributed FFT. The resulting tensor is backed by a buffer large enough for the specified FFT: because distributed FFT is in-place, the buffer must hold both the input and the output, accounting for both the input and output distributions. For the CUDA memory space, the tensor is allocated on the symmetric heap, on the device on which nvmath.distributed was initialized. This is a collective operation and must be called by all processes.
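To illustrate why the backing buffer must account for both distributions, here is a minimal sketch of the sizing idea in plain Python. It is not nvmath's implementation: `box_elements` and `required_bytes` are hypothetical helpers, and the box convention used (a pair of lower and upper corner coordinates, with the upper corner exclusive) is an assumption made for this example.

```python
from math import prod


def box_elements(box):
    """Number of elements in a box given as (lower, upper) corners.

    Assumption for this sketch: the upper corner is exclusive.
    """
    lower, upper = box
    return prod(u - l for l, u in zip(lower, upper))


def required_bytes(input_box, output_box, input_itemsize, output_itemsize):
    """Bytes needed so that one buffer can hold either the input or the output."""
    return max(
        box_elements(input_box) * input_itemsize,
        box_elements(output_box) * output_itemsize,
    )


# Hypothetical case: a global 8 x 6 C2C FFT with 8-byte complex elements,
# spread over 4 processes. Process 0 is assumed to hold rows 0..1 of the
# input and columns 0..1 of the output.
input_box = ((0, 0), (2, 6))   # 2 * 6 = 12 elements
output_box = ((0, 0), (8, 2))  # 8 * 2 = 16 elements
buffer_bytes = required_bytes(input_box, output_box, 8, 8)
print(buffer_bytes)  # 128
```

In this example the output box is larger than the input box, so a buffer sized only for the input shape would be too small for the in-place result.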
Parameters:
shape – Shape of the tensor to allocate.
package – Python package determining the tensor type (e.g. numpy, cupy, torch).
input_dtype – Tensor dtype in a form recognized by the package. If None, will use the package’s default dtype.
distribution – Specifies the distribution of input and output operands across processes, which can be: (i) according to a Slab distribution (see Slab), or (ii) a custom box distribution. With Slab distribution, this indicates the distribution of the input operand (the output operand will use the complementary Slab distribution). With box distribution, this indicates the input and output boxes.
memory_space – The memory space ('cpu' or 'cuda') on which to allocate the tensor. If not provided, this is inferred for packages that support a single memory space like numpy and cupy. For other packages it must be provided.
fft_type – The type of FFT to perform. Available options include 'C2C', 'C2R', and 'R2C'. The default is 'C2C' for complex input and 'R2C' for real input.
logger (logging.Logger) – Python Logger object. The root logger will be used if a logger object is not provided.
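The defaulting rules described for memory_space and fft_type can be summarized in a short sketch. The two functions below are hypothetical helpers written for illustration only; they are not part of nvmath and merely restate the documented behavior in plain Python.

```python
def resolve_fft_type(fft_type, input_is_complex):
    """Documented default: 'C2C' for complex input, 'R2C' for real input."""
    if fft_type is None:
        return "C2C" if input_is_complex else "R2C"
    if fft_type not in ("C2C", "C2R", "R2C"):
        raise ValueError(f"unsupported fft_type: {fft_type!r}")
    return fft_type


def resolve_memory_space(memory_space, package_name):
    """Infer the memory space only for packages with a single memory space."""
    if memory_space is not None:
        if memory_space not in ("cpu", "cuda"):
            raise ValueError(f"unsupported memory_space: {memory_space!r}")
        return memory_space
    # numpy arrays live in host memory, cupy arrays in device memory.
    single_space = {"numpy": "cpu", "cupy": "cuda"}
    if package_name in single_space:
        return single_space[package_name]
    # Packages such as torch support both spaces, so nothing can be inferred.
    raise ValueError(f"memory_space must be provided for package {package_name!r}")


print(resolve_fft_type(None, input_is_complex=False))  # R2C
print(resolve_memory_space(None, "cupy"))              # cuda
```

With a package such as torch, which supports both memory spaces, the second helper raises unless memory_space is passed explicitly, mirroring the requirement stated above.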