Other Methods
Frontend to Backend Traits Conversion
cufftdx::utils::frontend_to_backend(...) converts a frontend FFT description into the effective backend traits that are used to query the cuFFTDx databases.
In combination with the cuFFT Device API, this function can be used to generate the LTO database, which contains both device function code and metadata (a C++ header file), for a specified FFT operation. See Custom LTO Helper for an example.
#include "cufftdx/utils.hpp"

namespace cufftdx {
    namespace utils {
        struct backend_impl_traits {
            unsigned int  size;
            fft_type      type;
            fft_direction direction;
            unsigned int  sm;
            unsigned int  elements_per_thread;
            unsigned int  min_elements_per_thread;
        };

        enum class algorithm {
            ct,
            bluestein,
        };

        enum class execution_type {
            thread,
            block,
        };

        backend_impl_traits backend_traits =
            frontend_to_backend(
                algorithm               algo,
                execution_type          exec_type,
                unsigned int            fft_size            /* size_of<FFT>::value */,
                fft_type                type                /* type_of<FFT>::value */,
                fft_direction           direction           /* direction_of<FFT>::value */,
                unsigned int            sm                  /* sm_of<FFT>::value */,
                real_mode               real_mode           /* real_mode_of<FFT>::value */,
                unsigned int            elements_per_thread /* elements_per_thread_of<FFT>::value or
                                                             * 0 if not set */,
                unsigned int            block_dim_x         /* block_dim_of<FFT>::x or 0 if not set */,
                experimental::code_type code_type           /* experimental::code_type_of<FFT>::value */
            );
    } // namespace utils
} // namespace cufftdx
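The signature above uses 0 to mean "not set" for elements_per_thread and block_dim_x. A minimal sketch of that convention follows; the helper or_not_set is hypothetical and not part of cuFFTDx, and it only illustrates how an optional frontend setting could be turned into the value passed to frontend_to_backend.

```cpp
#include <cassert>
#include <optional>

// Hypothetical helper (not part of cuFFTDx): encodes an optional frontend
// setting as an unsigned value in which 0 means "not set".
inline unsigned int or_not_set(std::optional<unsigned int> value) {
    // 0 is reserved as the "not set" marker, so an explicit value must be nonzero.
    assert(!value.has_value() || *value != 0u);
    return value.value_or(0u);
}
```

With this sketch, or_not_set(std::nullopt) yields the "not set" marker, while or_not_set(8u) passes the explicit value through unchanged.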
(online) LTO Database Creation
cufftdx::utils::get_database_and_ltoir() returns a tuple containing:
Database string (std::string) to be inserted into the cuFFTDx headers.
Vector of LTOIRs (std::vector<std::vector<char>>) for building device functions for the specified FFT operation.
Required CUDA block dimensions (Dim3) for executing the FFT operation.
Required shared memory size (unsigned int, in bytes) for executing the FFT operation.
std::tuple<std::string, std::vector<std::vector<char>>, Dim3, unsigned int>
cufftdx::utils::get_database_and_ltoir(
    unsigned int                    fft_size,
    cufftdx::fft_direction          dir,
    cufftdx::fft_type               type,
    unsigned int                    sm,
    cufftdx::detail::execution_type execution,
    cufftdx::precision              prec           = cufftdx::precision::f32,
    cufftdx::complex_layout         layout         = cufftdx::complex_layout::natural,
    cufftdx::real_mode              rmode          = cufftdx::real_mode::normal,
    unsigned int                    fft_ept        = 0 /* use heuristic */,
    unsigned int                    ffts_per_block = 1 /* 0: use suggested ffts_per_block */);
Example
// Assuming the following FFT operator is defined in the NVRTC-compiled code:
// using FFT = decltype(cufftdx::Block() +
//                      cufftdx::Size<128>() +
//                      cufftdx::Type<cufftdx::fft_type::c2c>() +
//                      cufftdx::Direction<cufftdx::fft_direction::forward>() +
//                      cufftdx::Precision<float>() +
//                      cufftdx::ElementsPerThread<8>() +
//                      cufftdx::FFTsPerBlock<2>() +
//                      cufftdx::SM<700>());

// You can get the database string and LTOIRs for the FFT operation by calling:
auto [lto_db, ltoirs, block_dim, sm_size] =
    cufftdx::utils::get_database_and_ltoir(128,
                                           cufftdx::fft_direction::forward,
                                           cufftdx::fft_type::c2c,
                                           700,
                                           cufftdx::detail::execution_type::block,
                                           cufftdx::precision::f32,
                                           cufftdx::complex_layout::natural,
                                           cufftdx::real_mode::normal,
                                           8,
                                           2);
After obtaining the database string and LTOIRs, you can insert the database string into the cuFFTDx header file and link the LTOIRs to the user code as shown in Use Case II: Online Kernel Generation.
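Before linking, it can be useful to inspect what the call returned. The following is a minimal sketch that works on a stand-in for the returned tuple, so it compiles without cuFFTDx. Dim3Sketch and its x, y, z members, LtoResultSketch, LaunchSummary, and summarize are all hypothetical names introduced for illustration; the layout of the real Dim3 type is an assumption here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical stand-in for the Dim3 element of the returned tuple.
// The member names x, y, z are an assumption made for this sketch.
struct Dim3Sketch {
    unsigned int x, y, z;
};

// Same shape as the documented return type, with Dim3 replaced by the stand-in.
using LtoResultSketch =
    std::tuple<std::string, std::vector<std::vector<char>>, Dim3Sketch, unsigned int>;

// Hypothetical summary of the values needed to link and launch the kernel.
struct LaunchSummary {
    std::size_t  database_chars;      // length of the database string
    std::size_t  ltoir_count;         // number of LTOIR blobs to link
    std::size_t  ltoir_bytes;         // total size of all LTOIR blobs
    unsigned int threads_per_block;   // x * y * z of the required block dimensions
    unsigned int shared_memory_bytes; // required shared memory size
};

inline LaunchSummary summarize(const LtoResultSketch& result) {
    const auto& [lto_db, ltoirs, block_dim, shared_bytes] = result;

    // Sanity check for this sketch: a usable block has no zero dimension.
    assert(block_dim.x > 0 && block_dim.y > 0 && block_dim.z > 0);

    std::size_t total_bytes = 0;
    for (const auto& blob : ltoirs) {
        total_bytes += blob.size();
    }

    return LaunchSummary{lto_db.size(),
                         ltoirs.size(),
                         total_bytes,
                         block_dim.x * block_dim.y * block_dim.z,
                         shared_bytes};
}
```

In real code the tuple would come from cufftdx::utils::get_database_and_ltoir(), and the summary values would feed the link step and the kernel launch configuration.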
Note
The cufftdx::utils::get_database_and_ltoir() function is a wrapper around cuFFT Device APIs (see cuFFT Device API Reference). To use this function:
Define CUFFTDX_ENABLE_CUFFT_DEPENDENCY.
Link against the cuFFT library.