Utils#
Common utilities for CUDA operations, error handling, and type-safe containers.
Overview#
The Utils library provides utilities for building robust CUDA applications with automatic resource management, type-safe error handling, and efficient data structures. It simplifies common patterns and reduces boilerplate code.
Key Features#
CUDA Stream Management: RAII wrapper for automatic stream lifecycle management
Error Handling: Standard C++ error codes compatible with std::error_code
Exception Classes: Type-safe exceptions for CUDA runtime and driver API errors
Error Macros: Convenient macros for checking and throwing on CUDA errors
Fixed-Size Arrays: STL-compatible array container for host and device code
String Hashing: Transparent hash functor for allocation-free heterogeneous lookup in unordered containers
Core Concepts#
CUDA Stream Management#
CudaStream provides RAII-based automatic lifetime management for CUDA streams. Streams are created as non-blocking and automatically synchronized and destroyed when the object goes out of scope.
Basic Stream Usage#
// Create a CUDA stream with automatic lifetime management
const CudaStream stream;
// Use the stream with CUDA operations
cudaStream_t handle = stream.get();
// Synchronize the stream
const bool success = stream.synchronize();
Moving Streams#
CudaStream supports move semantics for transferring ownership:
// Create initial stream
CudaStream stream1;
cudaStream_t handle1 = stream1.get();
// Move stream ownership
const CudaStream stream2 = std::move(stream1);
cudaStream_t handle2 = stream2.get();
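The ownership transfer can be pictured with a small stand-in that does not touch CUDA at all. The sketch below is an illustration under stated assumptions, not CudaStream's implementation: FakeHandle, MoveOnlyHandle, and the live_handles counter are hypothetical names invented for this example. It shows the moved-from object ending up with a null handle (the "valid but empty state" described in the API reference) and the handle being released exactly once.

```cpp
#include <utility>

// Hypothetical stand-in for a native handle type such as cudaStream_t.
using FakeHandle = int *;

// Counts handles that have been acquired but not yet released.
int live_handles = 0;

// Move-only owner of a FakeHandle (illustration only, not CudaStream itself).
class MoveOnlyHandle {
public:
    MoveOnlyHandle() : handle_{new int{0}} { ++live_handles; }
    ~MoveOnlyHandle() { release(); }

    MoveOnlyHandle(const MoveOnlyHandle &) = delete;
    MoveOnlyHandle &operator=(const MoveOnlyHandle &) = delete;

    // Take the handle and leave the source empty (nullptr).
    MoveOnlyHandle(MoveOnlyHandle &&other) noexcept
            : handle_{std::exchange(other.handle_, nullptr)} {}

    // Release the current handle, then take ownership of the other one.
    MoveOnlyHandle &operator=(MoveOnlyHandle &&other) noexcept {
        if (this != &other) {
            release();
            handle_ = std::exchange(other.handle_, nullptr);
        }
        return *this;
    }

    FakeHandle get() const noexcept { return handle_; }

private:
    void release() noexcept {
        if (handle_ != nullptr) {
            delete handle_;
            handle_ = nullptr;
            --live_handles;
        }
    }

    FakeHandle handle_{nullptr};
};

// Returns true if moving transfers the handle and leaves the source empty.
bool move_transfers_ownership() {
    MoveOnlyHandle first;
    const FakeHandle original = first.get();
    const MoveOnlyHandle second = std::move(first);
    return second.get() == original && first.get() == nullptr && live_handles == 1;
}
```

Because the moved-from object holds nullptr, its destructor has nothing to release, so the handle is freed only once, by the new owner.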
Error Handling#
The library exposes its error codes through the NvErrc enum, which integrates
with std::error_code. This enables idiomatic C++ error handling without exceptions
when desired.
Error Code Usage#
// Create error code from NvErrc enum
const NvErrc error_code = NvErrc::Success;
// Check if error represents success
const bool is_success = (error_code == NvErrc::Success);
// Convert to std::error_code for standard error handling
const std::error_code std_error = make_error_code(error_code);
Error Code Conversion#
// Work with NvErrc error codes
const NvErrc nv_error = NvErrc::InvalidArgument;
// Convert to standard error_code
const std::error_code ec = make_error_code(nv_error);
// Check error category
const std::string category_name = ec.category().name();
// Get error message
const std::string message = ec.message();
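For readers unfamiliar with how an enum plugs into std::error_code, the following is a minimal standard-library sketch of the general mechanism. It is not the library's code: the sketch namespace, Errc, Category, and check_positive are hypothetical names, and the real NvErrc and NvErrorCategory may be wired differently.

```cpp
#include <string>
#include <system_error>
#include <type_traits>

namespace sketch {

// Hypothetical error enum standing in for NvErrc (names are assumptions).
enum class Errc : int { Success = 0, InvalidArgument = 1 };

// Category that names the error domain and maps values to messages.
class Category final : public std::error_category {
public:
    const char *name() const noexcept override { return "sketch"; }
    std::string message(const int condition) const override {
        switch (static_cast<Errc>(condition)) {
        case Errc::Success:
            return "success";
        case Errc::InvalidArgument:
            return "invalid argument";
        }
        return "unknown error";
    }
};

// Single shared instance; std::error_code compares categories by address.
inline const std::error_category &category() {
    static const Category instance;
    return instance;
}

// Found by argument-dependent lookup when an Errc becomes an error_code.
inline std::error_code make_error_code(const Errc errc) {
    return {static_cast<int>(errc), category()};
}

} // namespace sketch

namespace std {
// Opt in to implicit conversion from sketch::Errc to std::error_code.
template <>
struct is_error_code_enum<sketch::Errc> : true_type {};
} // namespace std

// Reports failure through the return value instead of throwing.
inline std::error_code check_positive(const int value) {
    if (value <= 0) {
        return sketch::Errc::InvalidArgument;
    }
    return sketch::Errc::Success;
}
```

A caller can then test the result with `if (const auto ec = check_positive(x); ec) { ... }`, since an error_code with value zero converts to false.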
Exception Classes#
Type-safe exception classes wrap CUDA errors and provide human-readable error messages.
CUDA Runtime Exceptions#
CudaRuntimeException wraps CUDA runtime API errors:
try {
// Simulate a CUDA error by selecting a device ordinal that does not exist
FRAMEWORK_CUDA_RUNTIME_CHECK_THROW(cudaSetDevice(9999));
} catch (const CudaRuntimeException &ex) {
// Exception caught and error message available
const char *error_msg = ex.what();
const bool caught = true;
}
CUDA Driver Exceptions#
CudaDriverException wraps CUDA driver API errors:
try {
// Initialize CUDA driver
FRAMEWORK_CUDA_DRIVER_CHECK_THROW(cuInit(0));
CUdevice device{};
// Simulate a driver API error by requesting a device ordinal that does not exist
FRAMEWORK_CUDA_DRIVER_CHECK_THROW(cuDeviceGet(&device, 9999));
} catch (const CudaDriverException &ex) {
// Exception caught with driver error details
const std::string error_msg = ex.what();
const bool caught = true;
}
Error Checking Macros#
Convenience macros simplify error checking and exception throwing. These macros automatically log error information with file and line number context.
The examples above demonstrate FRAMEWORK_CUDA_RUNTIME_CHECK_THROW and
FRAMEWORK_CUDA_DRIVER_CHECK_THROW for automatic error checking. Additional
macros provide conditional throwing and non-throwing variants:
const int value = 42;
try {
// Throw exception if condition is met
FRAMEWORK_NV_THROW_IF(value > 100, std::runtime_error, "Value exceeds maximum");
// This code executes because the condition is false
const bool condition_passed = true;
} catch (const std::runtime_error &) {
// Not reached for value = 42
}
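To make the pattern behind such macros concrete, here is a rough sketch of how a check-and-throw macro and a conditional-throw macro could be written. Everything in it is hypothetical (SketchStatus, SKETCH_CHECK_THROW, SKETCH_THROW_IF, do_work), and it attaches file and line to the exception message rather than logging, so it should not be read as the definition of the FRAMEWORK_ macros.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical status type standing in for cudaError_t or CUresult.
enum class SketchStatus { Ok, Failed };

// Hypothetical macro: evaluate expr once and throw if it did not succeed,
// attaching the call site's file and line to the message.
#define SKETCH_CHECK_THROW(expr) \
    do { \
        const SketchStatus sketch_status_ = (expr); \
        if (sketch_status_ != SketchStatus::Ok) { \
            throw std::runtime_error(std::string(#expr) + " failed at " + __FILE__ + ":" + \
                                     std::to_string(__LINE__)); \
        } \
    } while (false)

// Hypothetical macro: throw the given exception type when condition is true.
#define SKETCH_THROW_IF(condition, exception_type, message) \
    do { \
        if (condition) { \
            throw exception_type(message); \
        } \
    } while (false)

inline SketchStatus do_work(const bool succeed) {
    return succeed ? SketchStatus::Ok : SketchStatus::Failed;
}

// Returns the exception message, or an empty string if nothing was thrown.
inline std::string run_checked(const bool succeed) {
    try {
        SKETCH_CHECK_THROW(do_work(succeed));
    } catch (const std::runtime_error &ex) {
        return ex.what();
    }
    return {};
}

// Returns true if SKETCH_THROW_IF threw for the given value.
inline bool exceeds_maximum(const int value) {
    try {
        SKETCH_THROW_IF(value > 100, std::runtime_error, "Value exceeds maximum");
    } catch (const std::runtime_error &) {
        return true;
    }
    return false;
}
```

Wrapping the body in `do { ... } while (false)` lets the macro behave like a single statement, so it composes safely with unbraced if/else.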
Array Utilities#
Arr provides a fixed-size array container compatible with both host and device code. It offers STL-compatible iterators and unchecked element access through operator[] (no bounds checking is performed, for performance reasons).
Basic Array Usage#
// Create fixed-size array with 3 elements
Arr<float, 3> vec;
// Access elements
vec[0] = 1.0F;
vec[1] = 2.0F;
vec[2] = 3.0F;
// Get size
const std::size_t size = vec.size();
Array Iteration#
Arr<int, 4> arr;
arr[0] = 10;
arr[1] = 20;
arr[2] = 30;
arr[3] = 40;
// Iterate using range-based for loop
int sum = 0;
for (const int value : arr) {
sum += value;
}
Accessing Data#
Arr<double, 5> arr;
arr[0] = 1.5;
arr[1] = 2.5;
arr[2] = 3.5;
// Access underlying data pointer
const double *data_ptr = arr.data();
// Get array size
const std::size_t array_size = arr.size();
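The reason range-based for loops and standard algorithms work with such a container is that raw pointers already satisfy the iterator requirements. The sketch below shows the idea with a hypothetical MiniArr type written only against the standard library. It is not the library's Arr and omits the CUDA host/device annotations a real implementation would need.

```cpp
#include <array>
#include <cstddef>
#include <numeric>

// Hypothetical minimal fixed-size array (not the library's Arr).
template <typename T, std::size_t N>
class MiniArr {
public:
    static_assert(N > 0, "MiniArr requires at least one element");

    // Unchecked element access, like indexing a raw array.
    T &operator[](const std::size_t idx) { return storage_[idx]; }
    const T &operator[](const std::size_t idx) const { return storage_[idx]; }

    // Raw pointers serve as iterators, so range-for and <numeric> work.
    T *begin() noexcept { return storage_.data(); }
    T *end() noexcept { return storage_.data() + N; }
    const T *begin() const noexcept { return storage_.data(); }
    const T *end() const noexcept { return storage_.data() + N; }

    static constexpr std::size_t size() noexcept { return N; }

private:
    std::array<T, N> storage_{}; // value-initialized: zeros for arithmetic types
};

// Sums the elements through the pointer-based iterator interface.
inline int sum_example() {
    MiniArr<int, 4> arr;
    arr[0] = 10;
    arr[1] = 20;
    arr[2] = 30;
    arr[3] = 40;
    return std::accumulate(arr.begin(), arr.end(), 0);
}
```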
String Hashing#
TransparentStringHash enables heterogeneous lookup in unordered containers, eliminating temporary string allocations when using string literals or string_view as keys.
Basic Hash Usage#
// Create unordered_map with transparent string hash
std::unordered_map<std::string, int, TransparentStringHash, std::equal_to<>> map;
// Insert entries
map["first"] = 1;
map["second"] = 2;
// Lookup using string_view without allocating temporary string
const std::string_view key = "first";
const auto it = map.find(key);
const bool found = (it != map.end());
Efficient Lookups#
std::unordered_map<std::string, std::string, TransparentStringHash, std::equal_to<>> modules;
modules["cuda"] = "CUDA Runtime";
modules["driver"] = "CUDA Driver";
// Efficient lookup with string literal (no allocation)
const bool has_cuda = modules.contains("cuda");
// Lookup with string_view (no allocation)
const std::string_view driver_key = "driver";
const auto driver_it = modules.find(driver_key);
TransparentStringHash requires C++20 and must be used with a transparent comparator
like std::equal_to<> to enable heterogeneous lookup.
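As a rough illustration of the C++20 requirement, the sketch below defines a stand-in transparent hash and guards the heterogeneous lookup with the standard feature-test macro `__cpp_lib_generic_unordered_lookup`. SketchStringHash, SketchMap, lookup_or, and make_map are hypothetical names for this example only. The pre-C++20 branch is included to show the temporary std::string that heterogeneous lookup avoids.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <string_view>
#include <unordered_map>

// Hypothetical stand-in for a transparent string hash (not the library's type).
struct SketchStringHash {
    using is_transparent = void; // opts the hash in to heterogeneous lookup

    std::size_t operator()(const std::string_view sv) const noexcept {
        return std::hash<std::string_view>{}(sv);
    }
};

using SketchMap = std::unordered_map<std::string, int, SketchStringHash, std::equal_to<>>;

// Returns the mapped value for key, or fallback if the key is absent.
inline int lookup_or(const SketchMap &map, const std::string_view key, const int fallback) {
#if defined(__cpp_lib_generic_unordered_lookup)
    const auto it = map.find(key); // C++20: hashes the string_view directly
#else
    const auto it = map.find(std::string{key}); // earlier standards: builds a temporary
#endif
    return it != map.end() ? it->second : fallback;
}

inline SketchMap make_map() {
    SketchMap map;
    map["first"] = 1;
    map["second"] = 2;
    return map;
}
```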
Additional Examples#
For more examples, see framework/utils/tests/utils_sample_tests.cpp for
documentation examples and sample usage patterns.
API Reference#
-
enum class framework::utils::NvErrc : std::uint8_t#
NV error codes compatible with std::error_code
This enum class provides a type-safe wrapper around nvStatus_t values that integrates seamlessly with the standard C++ error handling framework.
Values:
-
enumerator Success#
The API call returned with no errors.
-
enumerator InternalError#
An unexpected, internal error occurred.
-
enumerator NotSupported#
The requested function is not currently supported.
-
enumerator InvalidArgument#
One or more of the arguments provided to the function was invalid.
-
enumerator ArchMismatch#
The requested operation is not supported on the current architecture.
-
enumerator AllocFailed#
A memory allocation failed.
-
enumerator SizeMismatch#
The sizes of the operands provided to the function do not match.
-
enumerator MemcpyError#
An error occurred during a memcpy operation.
-
enumerator InvalidConversion#
An invalid conversion operation was requested.
-
enumerator UnsupportedType#
An operation was requested on an unsupported type.
-
enumerator UnsupportedLayout#
An operation was requested on an unsupported layout.
-
enumerator UnsupportedRank#
An operation was requested on an unsupported rank.
-
enumerator UnsupportedConfig#
An operation was requested on an unsupported configuration.
-
enumerator UnsupportedAlignment#
One or more API arguments don’t have the required alignment.
-
enumerator ValueOutOfRange#
Data conversion could not occur because an input value was out of range.
-
enumerator RefMismatch#
Mismatch found when comparing to TV.
-
using framework::utils::VariantTypes = std::variant<unsigned int, signed char, char2, unsigned char, uchar2, short, short2, unsigned short, ushort2, int, int2, uint2, __half_raw, __half2_raw, float, cuComplex, double, cuDoubleComplex>#
Variant type containing all possible types from cuphyVariant_t union
This type-safe variant replaces the C-style union used in cuphyVariant_t, providing compile-time type safety and modern C++ semantics.
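To show what replacing a union with std::variant buys in practice, here is a small sketch using a reduced, hypothetical SketchVariant with only built-in alternatives (the library's VariantTypes also holds CUDA vector and half-precision types, which are left out so the example compiles without CUDA headers). The helper names are assumptions made for this illustration.

```cpp
#include <string>
#include <type_traits>
#include <variant>

// Hypothetical reduced variant with built-in alternatives only.
using SketchVariant = std::variant<unsigned int, int, float, double>;

// Names the active alternative without any manually maintained type tag.
inline std::string active_type_name(const SketchVariant &value) {
    return std::visit(
            [](const auto &held) -> std::string {
                using Held = std::decay_t<decltype(held)>;
                if constexpr (std::is_same_v<Held, unsigned int>) {
                    return "unsigned int";
                } else if constexpr (std::is_same_v<Held, int>) {
                    return "int";
                } else if constexpr (std::is_same_v<Held, float>) {
                    return "float";
                } else {
                    return "double";
                }
            },
            value);
}

// Checks the active alternative instead of reinterpreting the stored bytes.
inline bool holds_float(const SketchVariant &value) {
    return std::holds_alternative<float>(value);
}
```

Unlike a C-style union, the variant tracks which alternative is active, so std::get on the wrong type throws std::bad_variant_access rather than silently reading another member's bytes.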
-
constexpr bool framework::utils::GSL_CONTRACT_THROWS = false#
GSL contract violation behavior flag (true if violations throw exceptions)
-
framework::utils::DECLARE_LOG_COMPONENT(Core, CoreNvApi, CoreCudaDriver, CoreCudaRuntime, CoreGraphManager, CoreGraph, CoreModule, CorePipeline, CoreFactory)#
Declare Core components for core subsystem.
-
template<typename Enum>
constexpr std::underlying_type_t<Enum> framework::utils::to_underlying(const Enum e)#
Converts an enumeration to its underlying type
This function provides the same functionality as std::to_underlying from C++23 for earlier C++ standards. It safely converts an enumeration value to its underlying integral type.
Note
This function is equivalent to: static_cast<std::underlying_type_t<Enum>>(e)
- Template Parameters:
Enum – The enumeration type to convert
- Parameters:
e – [in] The enumeration value to convert
- Returns:
The enumeration value converted to its underlying type
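A helper of this kind can be sketched in a few lines of C++14. The version below uses the hypothetical name sketch_to_underlying and a hypothetical Color enum. It mirrors the static_cast equivalence stated in the note above but is not claimed to match the library's definition.

```cpp
#include <cstdint>
#include <type_traits>

// Sketch of a pre-C++23 to_underlying; the name is hypothetical.
template <typename Enum>
constexpr std::underlying_type_t<Enum> sketch_to_underlying(const Enum e) noexcept {
    static_assert(std::is_enum<Enum>::value, "Enum must be an enumeration type");
    return static_cast<std::underlying_type_t<Enum>>(e);
}

// Hypothetical enum with a fixed 8-bit underlying type.
enum class Color : std::uint8_t { Red = 1, Green = 2 };

// The result has the enum's underlying type, not int.
static_assert(std::is_same<decltype(sketch_to_underlying(Color::Green)), std::uint8_t>::value,
              "result type follows the underlying type");
static_assert(sketch_to_underlying(Color::Green) == 2, "value is preserved");
```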
-
inline const NvErrorCategory &framework::utils::nv_category()#
Get the singleton instance of the NV error category
- Returns:
Reference to the NV error category
-
inline std::error_code framework::utils::make_error_code(const NvErrc errc)#
Create an error_code from a NvErrc value
- Parameters:
errc – [in] The NV error code
- Returns:
A std::error_code representing the NV error
-
constexpr NvErrc framework::utils::from_nv_status(const int status)#
Convert a raw nvStatus_t value to NvErrc
Note
This function performs a static_cast and assumes the input is valid
- Parameters:
status – [in] Raw nvStatus_t value
- Returns:
Equivalent NvErrc value
-
inline std::error_code framework::utils::make_error_code(const int status)#
Create an error_code from a raw nvStatus_t value
- Parameters:
status – [in] Raw nvStatus_t value
- Returns:
A std::error_code representing the NV error
-
constexpr bool framework::utils::is_success(const NvErrc errc)#
Check if a NvErrc represents success
- Parameters:
errc – [in] The error code to check
- Returns:
true if the error code represents success, false otherwise
-
inline bool framework::utils::is_nv_success(const std::error_code &errc)#
Check if an error_code represents NV success
- Parameters:
errc – [in] The error code to check
- Returns:
true if the error code represents NV success, false otherwise
-
inline const char *framework::utils::get_error_name(const NvErrc errc)#
Get the name of a NvErrc enum value
- Parameters:
errc – [in] The error code
- Returns:
The enum name as a string
-
template<typename T, std::size_t DIM>
class Arr#
- #include <arr.hpp>
Fixed-size array container for mathematical operations
This class provides a lightweight, fixed-size array container optimized for mathematical operations in CUDA environments. It supports both host and device execution contexts and provides STL-compatible iterators.
Note
The array uses std::array for internal storage, which is fully compatible with CUDA device code when compiled with --expt-relaxed-constexpr. This is because std::array is a POD type and can be used in constexpr functions.
- Template Parameters:
T – The element type (must be default constructible)
DIM – The number of elements in the array (must be > 0)
Public Functions
-
constexpr Arr() noexcept = default#
Default constructor - zero-initializes all elements
Creates an array with all elements initialized to their default value (zero for numeric types, false for bool, etc.).
-
template<std::size_t N>
inline explicit constexpr Arr(
)#
Array constructor - initializes from C-style array
Constructs the array by copying elements from the provided C-style array. The array size must exactly match the array dimension.
Note
This constructor will fail to compile if N != DIM due to static_assert
- Template Parameters:
N – The size of the input array (must equal DIM)
- Parameters:
arr – [in] The source C-style array reference to copy from
-
template<std::size_t N>
inline explicit Arr(
)#
Array constructor - initializes from std::array
Constructs the array by copying elements from the provided std::array. The array size must exactly match the array dimension.
Note
This constructor will fail to compile if N != DIM due to static_assert
- Template Parameters:
N – The size of the input array (must equal DIM)
- Parameters:
arr – [in] The source std::array to copy from
-
inline void fill(T val)#
Fill all elements with the same value
Sets all elements of the array to the specified value.
- Parameters:
val – [in] The value to assign to all elements
-
inline T &operator[](std::size_t idx)#
Access element by index (mutable)
Provides direct access to the element at the specified index. No bounds checking is performed.
Note
No bounds checking is performed for performance reasons
- Parameters:
idx – [in] The index of the element to access
- Returns:
Reference to the element at the specified index
-
inline const T &operator[](std::size_t idx) const#
Access element by index (immutable)
Provides read-only access to the element at the specified index. No bounds checking is performed.
Note
No bounds checking is performed for performance reasons
- Parameters:
idx – [in] The index of the element to access
- Returns:
Const reference to the element at the specified index
-
inline constexpr T *begin() noexcept#
Get mutable iterator to beginning
Returns a pointer to the first element, enabling STL-style iteration.
- Returns:
Pointer to the first element
-
inline constexpr T *end() noexcept#
Get mutable iterator to end
Returns a pointer to one past the last element, enabling STL-style iteration.
- Returns:
Pointer to one past the last element
-
inline constexpr const T *begin() const noexcept#
Get const iterator to beginning
Returns a const pointer to the first element, enabling STL-style iteration for const arrays.
- Returns:
Const pointer to the first element
-
inline constexpr const T *end() const noexcept#
Get const iterator to end
Returns a const pointer to one past the last element, enabling STL-style iteration for const arrays.
- Returns:
Const pointer to one past the last element
-
inline constexpr T *data() noexcept#
Get a pointer to the data of the array
Returns a pointer to the first element of the array.
- Returns:
A pointer to the first element of the array
Public Static Functions
-
static inline constexpr std::size_t size() noexcept#
Get the size of the array
Returns the number of elements in the array.
- Returns:
The number of elements in the array
Friends
-
inline friend bool operator==(const Arr &lhs, const Arr &rhs)#
Equality comparison operator
Compares two arrays element-wise for equality. Uses index-based comparison for CUDA-compatible operation.
- Parameters:
lhs – [in] The first array to compare
rhs – [in] The second array to compare
- Returns:
True if all corresponding elements are equal, false otherwise
-
inline friend bool operator!=(const Arr &lhs, const Arr &rhs)#
Inequality comparison operator
Compares two arrays element-wise for inequality.
- Parameters:
lhs – [in] The first array to compare
rhs – [in] The second array to compare
- Returns:
True if any corresponding elements are not equal, false otherwise
-
class CudaDriverException : public std::exception#
- #include <exceptions.hpp>
Exception class for CUDA driver API errors
This exception wraps CUresult values and provides detailed error information including error names, descriptions, and optional user context.
Public Functions
-
inline explicit CudaDriverException(const CUresult result, const std::string_view user_str = "")#
Construct a CUDA driver exception from a driver API result code
Creates a detailed error message that includes the error name and description obtained from the CUDA driver API, along with optional user-provided context.
- Parameters:
result – [in] The CUDA driver API result code that caused the exception
user_str – [in] Optional user-provided context string to include in the error message
-
inline const char *what() const noexcept override#
Get the error message for this exception
- Returns:
Formatted error message containing error name, description, and optional user context
-
class CudaRuntimeException : public std::exception#
- #include <exceptions.hpp>
Exception class for CUDA runtime API errors
This exception wraps cudaError_t values and provides human-readable error messages through the CUDA runtime API.
Public Functions
-
inline explicit CudaRuntimeException(const cudaError_t status)#
Construct a CUDA exception from a CUDA runtime error code
- Parameters:
status – [in] The CUDA runtime error code that caused the exception
-
inline const char *what() const noexcept override#
Get the error message for this exception
- Returns:
Human-readable error message from cudaGetErrorString
-
class CudaStream#
- #include <cuda_stream.hpp>
RAII wrapper for CUDA stream management
This class provides automatic lifetime management for cudaStream_t handles. The stream is created with cudaStreamNonBlocking flag and automatically synchronized and destroyed when the object goes out of scope.
Example usage:
{
    CudaStream stream;
    kernel<<<blocks, threads, 0, stream.get()>>>();
    stream.synchronize(); // Optional explicit sync
} // Stream automatically synchronized and destroyed here
Public Functions
-
CudaStream()#
Create and initialize CUDA stream
Creates a non-blocking CUDA stream using cudaStreamNonBlocking flag.
- Throws:
std::runtime_error – if CUDA stream creation fails
-
~CudaStream()#
Synchronize and destroy CUDA stream
Automatically synchronizes the stream before destroying it to ensure all queued operations complete. Errors during cleanup are logged but never thrown, because the destructor is noexcept.
-
CudaStream(const CudaStream&) = delete#
-
CudaStream &operator=(const CudaStream&) = delete#
-
CudaStream(CudaStream &&other) noexcept#
Move constructor - transfer ownership of CUDA stream
Transfers ownership of the CUDA stream from another CudaStream object. The source object is left in a valid but empty state (nullptr stream).
- Parameters:
other – [in] Source CudaStream to move from
-
CudaStream &operator=(CudaStream &&other) noexcept#
Move assignment operator - transfer ownership of CUDA stream
Synchronizes and destroys the current stream (if any), then transfers ownership of the CUDA stream from another CudaStream object. The source object is left in a valid but empty state (nullptr stream).
- Parameters:
other – [in] Source CudaStream to move from
- Returns:
Reference to this object
-
inline cudaStream_t get() const noexcept#
Get the underlying CUDA stream handle
- Returns:
CUDA stream handle for use with CUDA APIs
-
bool synchronize() const noexcept#
Synchronize the CUDA stream
Blocks the calling CPU thread until all previously queued operations on this stream have completed.
- Returns:
true if synchronization succeeded, false on error (error is logged)
-
class NvErrorCategory : public std::error_category#
- #include <errors.hpp>
Custom error category for NV errors
This class provides human-readable error messages and integrates NV errors with the standard C++ error handling system.
Public Functions
-
inline const char *name() const noexcept override#
Get the name of this error category
- Returns:
The category name as a C-style string
-
inline std::string message(const int condition) const override#
Get a descriptive message for the given error code – O(1) lookup, no heap allocation except for the implicit std::string construction required by the std::error_category interface.
- Parameters:
condition – [in] The error code value
- Returns:
A descriptive error message
- inline std::error_condition default_error_condition(
- const int condition,
Map NV errors to standard error conditions where applicable
- Parameters:
condition – [in] The error code value
- Returns:
The equivalent standard error condition, or a default-constructed condition
Public Static Functions
-
static inline const char *name(const int condition)#
Get the name of the error code enum value
- Parameters:
condition – [in] The error code value
- Returns:
The enum name as a string (e.g., “success”, “internal_error”)
-
class NvException : public std::exception#
- #include <exceptions.hpp>
Exception class for NV library errors
This exception wraps NvErrc error codes and provides human-readable error messages through the NV error category system.
Public Functions
-
inline explicit NvException(const NvErrc status)#
Construct a NV exception from an error code
- Parameters:
status – [in] The NV error code that caused the exception
-
inline const char *what() const noexcept override#
Get the error message for this exception
- Returns:
Human-readable error message from the NV error category
-
class NvFnException : public std::exception#
- #include <exceptions.hpp>
Exception class for NV function-specific errors
This exception provides detailed error information including the function name that failed, along with the error code and description from the NV error system.
Public Functions
-
inline NvFnException(const NvErrc status, const std::string_view function_name_str)#
Construct a NV function exception with function context
Creates a detailed error message that includes the function name, error code message, and error code name for debugging purposes.
- Parameters:
status – [in] The NV error code that caused the exception
function_name_str – [in] The name of the function that failed
-
inline const char *what() const noexcept override#
Get the error message for this exception
- Returns:
Formatted error message containing function name, error message, and error name
-
struct TransparentStringHash#
- #include <string_hash.hpp>
Transparent hash functor for string types that enables heterogeneous lookup.
This allows std::unordered_map<std::string, T> to perform lookups with string_view without constructing temporary strings, improving performance by avoiding allocations.
Background:
C++14 introduced heterogeneous lookup for ordered containers (std::map, std::set) using transparent comparators like std::less<>
C++20 extended this to unordered containers (std::unordered_map, std::unordered_set) requiring both transparent hash and comparator
The is_transparent tag enables the container to accept different key types
Example usage:
std::unordered_map<std::string, Module, TransparentStringHash, std::equal_to<>> modules;
// Zero allocations - uses string_view directly
modules.find("key"sv);
// Zero allocations - uses const char* directly
modules.contains("literal");
// Works with std::string too
std::string key = "dynamic";
modules.find(key);
Performance benefits:
Eliminates temporary std::string allocations during lookups
Especially beneficial when using string literals or string_view keys
No overhead compared to std::hash<std::string> for std::string keys
Requirements:
C++20 for heterogeneous lookup support in unordered containers
Must be used with transparent comparator (e.g., std::equal_to<>)
Public Types
-
using is_transparent = void#
Tag enabling heterogeneous lookup.
Public Functions
-
inline std::size_t operator()(std::string_view sv) const noexcept#
Hash function for string_view and string-like types.
Uses std::hash<std::string_view> which works efficiently with all string-like types including std::string, std::string_view, and const char*.
- Parameters:
sv – [in] String view to hash
- Returns:
Hash value