Utils#

Common utilities for CUDA operations, error handling, and type-safe containers.

Overview#

The Utils library provides utilities for building robust CUDA applications with automatic resource management, type-safe error handling, and efficient data structures. It simplifies common patterns and reduces boilerplate code.

Key Features#

  • CUDA Stream Management: RAII wrapper for automatic stream lifecycle management

  • Error Handling: Standard C++ error codes compatible with std::error_code

  • Exception Classes: Type-safe exceptions for CUDA runtime and driver API errors

  • Error Macros: Convenient macros for checking and throwing on CUDA errors

  • Fixed-Size Arrays: STL-compatible array container for host and device code

  • String Hashing: Transparent hash functor for heterogeneous lookup in unordered containers

Core Concepts#

CUDA Stream Management#

CudaStream provides RAII-based automatic lifetime management for CUDA streams. Streams are created as non-blocking and automatically synchronized and destroyed when the object goes out of scope.

Basic Stream Usage#

// Create a CUDA stream with automatic lifetime management
const CudaStream stream;

// Use the stream with CUDA operations
cudaStream_t handle = stream.get();

// Synchronize the stream
const bool success = stream.synchronize();

Moving Streams#

CudaStream supports move semantics for transferring ownership:

// Create initial stream
CudaStream stream1;
cudaStream_t handle1 = stream1.get();

// Move stream ownership
const CudaStream stream2 = std::move(stream1);
cudaStream_t handle2 = stream2.get();

Error Handling#

The library provides standard C++ error codes through the NvErrc enum, which integrates with std::error_code. This enables idiomatic C++ error handling without exceptions when desired.

Error Code Usage#

// Create error code from NvErrc enum
const NvErrc error_code = NvErrc::Success;

// Check if error represents success
const bool is_success = (error_code == NvErrc::Success);

// Convert to std::error_code for standard error handling
const std::error_code std_error = make_error_code(error_code);

Error Code Conversion#

// Work with NvErrc error codes
const NvErrc nv_error = NvErrc::InvalidArgument;

// Convert to standard error_code
const std::error_code ec = make_error_code(nv_error);

// Check error category
const std::string category_name = ec.category().name();

// Get error message
const std::string message = ec.message();
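
The pattern behind this integration can be sketched with the standard library alone. The following is a minimal sketch, not the library's implementation: DemoErrc, DemoCategory, and demo_category are hypothetical stand-ins for NvErrc, NvErrorCategory, and nv_category, and the message strings are invented for illustration.

```cpp
#include <cstdint>
#include <string>
#include <system_error>

// Hypothetical error enum standing in for NvErrc (illustration only).
enum class DemoErrc : std::uint8_t { Success = 0, InvalidArgument = 1 };

// Hypothetical category: names the error domain and maps values to text.
class DemoCategory final : public std::error_category {
public:
    const char *name() const noexcept override { return "demo"; }

    std::string message(const int condition) const override {
        switch (static_cast<DemoErrc>(condition)) {
        case DemoErrc::Success:
            return "no error";
        case DemoErrc::InvalidArgument:
            return "invalid argument";
        }
        return "unknown error";
    }
};

// Singleton accessor: std::error_code compares categories by address,
// so every error_code must refer to the same category object.
inline const DemoCategory &demo_category() noexcept {
    static const DemoCategory instance{};
    return instance;
}

// Pair the enum value with its category to form a std::error_code.
inline std::error_code make_error_code(const DemoErrc errc) noexcept {
    return {static_cast<int>(errc), demo_category()};
}
```

Because a std::error_code with value 0 converts to false in a boolean context, callers can write `if (ec) { /* handle failure */ }` without knowing which category produced it.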

Exception Classes#

Type-safe exception classes wrap CUDA errors and provide human-readable error messages.

CUDA Runtime Exceptions#

CudaRuntimeException wraps CUDA runtime API errors:

try {
    // Trigger a CUDA runtime error by selecting a nonexistent device
    FRAMEWORK_CUDA_RUNTIME_CHECK_THROW(cudaSetDevice(9999));
} catch (const CudaRuntimeException &ex) {
    // Exception caught and error message available
    const char *error_msg = ex.what();
    const bool caught = true;
}

CUDA Driver Exceptions#

CudaDriverException wraps CUDA driver API errors:

try {
    // Initialize CUDA driver
    FRAMEWORK_CUDA_DRIVER_CHECK_THROW(cuInit(0));

    CUdevice device{};
    // Trigger a driver API error by requesting a nonexistent device ordinal
    FRAMEWORK_CUDA_DRIVER_CHECK_THROW(cuDeviceGet(&device, 9999));
} catch (const CudaDriverException &ex) {
    // Exception caught with driver error details
    const std::string error_msg = ex.what();
    const bool caught = true;
}

Error Checking Macros#

Convenience macros simplify error checking and exception throwing. These macros automatically log error information with file and line number context.

The examples above demonstrate FRAMEWORK_CUDA_RUNTIME_CHECK_THROW and FRAMEWORK_CUDA_DRIVER_CHECK_THROW for automatic error checking. Additional macros provide conditional throwing and non-throwing variants:

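const int value = 42;

try {
    // Throw exception if condition is met
    FRAMEWORK_NV_THROW_IF(value > 100, std::runtime_error, "Value exceeds maximum");

    // This code executes because condition is false
    const bool condition_passed = true;
} catch (const std::runtime_error &ex) {
    // Not reached in this example: the condition is false, so nothing is thrown
}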
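
To illustrate how a conditional-throw macro of this kind can carry file and line context, here is a minimal sketch. DEMO_THROW_IF and check_value are hypothetical names, and this is not the definition of FRAMEWORK_NV_THROW_IF: the sketch embeds the call site in the exception message, whereas the library is described as logging it.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical macro (not the library's definition): throw ExceptionType when
// the condition holds, prefixing the message with the call site. It must be a
// macro so that __FILE__ and __LINE__ expand at the point of use.
#define DEMO_THROW_IF(condition, ExceptionType, message)                       \
    do {                                                                       \
        if (condition) {                                                       \
            throw ExceptionType(std::string(__FILE__) + ":" +                  \
                                std::to_string(__LINE__) + ": " + (message));  \
        }                                                                      \
    } while (false)

// Returns the exception text, or an empty string when nothing was thrown.
inline std::string check_value(const int value) {
    try {
        DEMO_THROW_IF(value > 100, std::runtime_error, "Value exceeds maximum");
    } catch (const std::runtime_error &ex) {
        return ex.what();
    }
    return {};
}
```

The `do { ... } while (false)` wrapper makes the macro behave like a single statement, so it can be used safely inside an unbraced if/else.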

Array Utilities#

Arr provides a fixed-size array container compatible with both host and device code. It offers STL-compatible iterators and element access through operator[], which performs no bounds checking.

Basic Array Usage#

// Create fixed-size array with 3 elements
Arr<float, 3> vec;

// Access elements
vec[0] = 1.0F;
vec[1] = 2.0F;
vec[2] = 3.0F;

// Get size
const std::size_t size = vec.size();

Array Iteration#

Arr<int, 4> arr;
arr[0] = 10;
arr[1] = 20;
arr[2] = 30;
arr[3] = 40;

// Iterate using range-based for loop
int sum = 0;
for (const int value : arr) {
    sum += value;
}

Accessing Data#

Arr<double, 5> arr;
arr[0] = 1.5;
arr[1] = 2.5;
arr[2] = 3.5;

// Access underlying data pointer
const double *data_ptr = arr.data();

// Get array size
const std::size_t array_size = arr.size();

String Hashing#

TransparentStringHash enables heterogeneous lookup in unordered containers, eliminating temporary string allocations when using string literals or string_view as keys.

Basic Hash Usage#

// Create unordered_map with transparent string hash
std::unordered_map<std::string, int, TransparentStringHash, std::equal_to<>> map;

// Insert entries
map["first"] = 1;
map["second"] = 2;

// Lookup using string_view without allocating temporary string
const std::string_view key = "first";
const auto it = map.find(key);
const bool found = (it != map.end());

Efficient Lookups#

std::unordered_map<std::string, std::string, TransparentStringHash, std::equal_to<>> modules;
modules["cuda"] = "CUDA Runtime";
modules["driver"] = "CUDA Driver";

// Efficient lookup with string literal (no allocation)
const bool has_cuda = modules.contains("cuda");

// Lookup with string_view (no allocation)
const std::string_view driver_key = "driver";
const auto driver_it = modules.find(driver_key);

TransparentStringHash requires C++20 and must be used with a transparent comparator like std::equal_to<> to enable heterogeneous lookup.
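
A functor with these properties can be sketched in a few lines. The following is an illustration under stated assumptions rather than the library's source: DemoStringHash, DemoMap, and lookup_or are hypothetical names, and the feature-test guard is included only so the sketch still compiles before C++20, where it falls back to building a temporary std::string.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <string_view>
#include <unordered_map>

// Hypothetical stand-in for TransparentStringHash (illustration only).
struct DemoStringHash {
    // Tag that opts the container into heterogeneous lookup.
    using is_transparent = void;

    // std::string, std::string_view, and const char* all convert to
    // string_view, so every key form produces the same hash value.
    std::size_t operator()(const std::string_view sv) const noexcept {
        return std::hash<std::string_view>{}(sv);
    }
};

using DemoMap = std::unordered_map<std::string, int, DemoStringHash, std::equal_to<>>;

// Returns the mapped value, or fallback when the key is absent.
inline int lookup_or(const DemoMap &map, const std::string_view key, const int fallback) {
#if defined(__cpp_lib_generic_unordered_lookup)
    const auto it = map.find(key);  // heterogeneous: no std::string is built
#else
    const auto it = map.find(std::string(key));  // pre-C++20 fallback: allocates
#endif
    return it != map.end() ? it->second : fallback;
}
```

Both the hash and the comparator must be transparent: the hash computes a bucket from the string_view, and std::equal_to<> then compares the stored std::string against it without conversion.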

Additional Examples#

For more examples, see framework/utils/tests/utils_sample_tests.cpp, which contains the documentation examples and sample usage patterns.

API Reference#

enum class framework::utils::NvErrc : std::uint8_t#

NV error codes compatible with std::error_code

This enum class provides a type-safe wrapper around nvStatus_t values that integrates seamlessly with the standard C++ error handling framework.

Values:

enumerator Success#

The API call returned with no errors.

enumerator InternalError#

An unexpected, internal error occurred.

enumerator NotSupported#

The requested function is not currently supported.

enumerator InvalidArgument#

One or more of the arguments provided to the function was invalid.

enumerator ArchMismatch#

The requested operation is not supported on the current architecture.

enumerator AllocFailed#

A memory allocation failed.

enumerator SizeMismatch#

The sizes of the operands provided to the function do not match.

enumerator MemcpyError#

An error occurred during a memcpy operation.

enumerator InvalidConversion#

An invalid conversion operation was requested.

enumerator UnsupportedType#

An operation was requested on an unsupported type.

enumerator UnsupportedLayout#

An operation was requested on an unsupported layout.

enumerator UnsupportedRank#

An operation was requested on an unsupported rank.

enumerator UnsupportedConfig#

An operation was requested on an unsupported configuration.

enumerator UnsupportedAlignment#

One or more API arguments don’t have the required alignment.

enumerator ValueOutOfRange#

Data conversion could not occur because an input value was out of range.

enumerator RefMismatch#

Mismatch found when comparing to TV.

using framework::utils::VariantTypes = std::variant<unsigned int, signed char, char2, unsigned char, uchar2, short, short2, unsigned short, ushort2, int, int2, uint2, __half_raw, __half2_raw, float, cuComplex, double, cuDoubleComplex>#

Variant type containing all possible types from cuphyVariant_t union

This type-safe variant replaces the C-style union used in cuphyVariant_t, providing compile-time type safety and modern C++ semantics.

constexpr bool framework::utils::GSL_CONTRACT_THROWS = false#

GSL contract violation behavior flag (true if violations throw exceptions)

framework::utils::DECLARE_LOG_COMPONENT(
Core,
CoreNvApi,
CoreCudaDriver,
CoreCudaRuntime,
CoreGraphManager,
CoreGraph,
CoreModule,
CorePipeline,
CoreFactory,
)#

Declare Core components for core subsystem.

template<typename Enum>
constexpr std::underlying_type_t<Enum> framework::utils::to_underlying(
const Enum e,
) noexcept#

Converts an enumeration to its underlying type

This function provides the same functionality as std::to_underlying from C++23 for earlier C++ standards. It safely converts an enumeration value to its underlying integral type.

Note

This function is equivalent to: static_cast<std::underlying_type_t<Enum>>(e)

Template Parameters:

Enum – The enumeration type to convert

Parameters:

e[in] The enumeration value to convert

Returns:

The enumeration value converted to its underlying type
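
The equivalence stated in the note can be written out directly. This is a minimal sketch: demo_to_underlying and DemoStatus are hypothetical names used so the example does not depend on the library headers.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical re-creation of the helper described above (illustration only).
template <typename Enum>
constexpr std::underlying_type_t<Enum> demo_to_underlying(const Enum e) noexcept {
    return static_cast<std::underlying_type_t<Enum>>(e);
}

// Hypothetical enum with an explicit underlying type.
enum class DemoStatus : std::uint8_t { Ok = 0, Failed = 7 };

// The result type follows the enum's underlying type, not int.
static_assert(std::is_same_v<decltype(demo_to_underlying(DemoStatus::Ok)), std::uint8_t>);

// The conversion is usable in constant expressions.
static_assert(demo_to_underlying(DemoStatus::Failed) == 7);
```

Compared with a hand-written static_cast<int>, the helper cannot silently pick the wrong integer type if the enum's underlying type changes later.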

inline const NvErrorCategory &framework::utils::nv_category(
) noexcept#

Get the singleton instance of the NV error category

Returns:

Reference to the NV error category

inline std::error_code framework::utils::make_error_code(
const NvErrc errc,
) noexcept#

Create an error_code from a NvErrc value

Parameters:

errc[in] The NV error code

Returns:

A std::error_code representing the NV error

constexpr NvErrc framework::utils::from_nv_status(
const int status,
) noexcept#

Convert a raw nvStatus_t value to NvErrc

Note

This function performs a static_cast and assumes the input is valid

Parameters:

status[in] Raw nvStatus_t value

Returns:

Equivalent NvErrc value

inline std::error_code framework::utils::make_error_code(
const int status,
) noexcept#

Create an error_code from a raw nvStatus_t value

Parameters:

status[in] Raw nvStatus_t value

Returns:

A std::error_code representing the NV error

constexpr bool framework::utils::is_success(
const NvErrc errc,
) noexcept#

Check if a NvErrc represents success

Parameters:

errc[in] The error code to check

Returns:

true if the error code represents success, false otherwise

inline bool framework::utils::is_nv_success(
const std::error_code &errc,
) noexcept#

Check if an error_code represents NV success

Parameters:

errc[in] The error code to check

Returns:

true if the error code represents NV success, false otherwise

inline const char *framework::utils::get_error_name(
const NvErrc errc,
) noexcept#

Get the name of a NvErrc enum value

Parameters:

errc[in] The error code

Returns:

The enum name as a string

template<typename T, std::size_t DIM>
class Arr#
#include <arr.hpp>

Fixed-size array container for mathematical operations

This class provides a lightweight, fixed-size array container optimized for mathematical operations in CUDA environments. It supports both host and device execution contexts and provides STL-compatible iterators.

Note

The array uses std::array for internal storage, which is fully compatible with CUDA device code when compiled with --expt-relaxed-constexpr. This is because std::array is a POD type and can be used in constexpr functions.

Template Parameters:
  • T – The element type (must be default constructible)

  • DIM – The number of elements in the array (must be > 0)

Public Functions

constexpr Arr() noexcept = default#

Default constructor - zero-initializes all elements

Creates an array with all elements initialized to their default value (zero for numeric types, false for bool, etc.).

template<std::size_t N>
inline explicit constexpr Arr(
const T (&arr)[N],
)#

Array constructor - initializes from C-style array

Constructs the array by copying elements from the provided C-style array. The array size must exactly match the array dimension.

Note

This constructor will fail to compile if N != DIM due to static_assert

Template Parameters:

N – The size of the input array (must equal DIM)

Parameters:

arr[in] The source C-style array reference to copy from
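
The compile-time size check can be sketched as follows. DemoArr and third_element are hypothetical names, and the class is reduced to the parts needed to show the technique; it is not the library's Arr.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for Arr, reduced to the size-checked constructor.
template <typename T, std::size_t DIM>
class DemoArr {
public:
    // N is deduced from the argument, so a mismatch is caught at compile time.
    template <std::size_t N>
    explicit DemoArr(const T (&arr)[N]) {
        static_assert(N == DIM, "source array size must equal DIM");
        for (std::size_t i = 0; i < DIM; ++i) {
            data_[i] = arr[i];
        }
    }

    const T &operator[](const std::size_t idx) const { return data_[idx]; }
    static constexpr std::size_t size() noexcept { return DIM; }

private:
    std::array<T, DIM> data_{};
};

inline int third_element() {
    const int source[3] = {10, 20, 30};
    const DemoArr<int, 3> arr(source);
    // const DemoArr<int, 4> bad(source);  // would not compile: N (3) != DIM (4)
    return arr[2];
}
```

Taking the parameter as a reference to an array, rather than as a pointer, is what preserves the length in the type and makes the static_assert possible.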

template<std::size_t N>
inline explicit Arr(
const std::array<T, N> &arr,
)#

Array constructor - initializes from std::array

Constructs the array by copying elements from the provided std::array. The array size must exactly match the array dimension.

Note

This constructor will fail to compile if N != DIM due to static_assert

Template Parameters:

N – The size of the input array (must equal DIM)

Parameters:

arr[in] The source std::array to copy from

inline void fill(T val)#

Fill all elements with the same value

Sets all elements of the array to the specified value.

Parameters:

val[in] The value to assign to all elements

inline T &operator[](std::size_t idx)#

Access element by index (mutable)

Provides direct access to the element at the specified index. No bounds checking is performed.

Note

No bounds checking is performed for performance reasons

Parameters:

idx[in] The index of the element to access

Returns:

Reference to the element at the specified index

inline const T &operator[](std::size_t idx) const#

Access element by index (immutable)

Provides read-only access to the element at the specified index. No bounds checking is performed.

Note

No bounds checking is performed for performance reasons

Parameters:

idx[in] The index of the element to access

Returns:

Const reference to the element at the specified index

inline constexpr T *begin() noexcept#

Get mutable iterator to beginning

Returns a pointer to the first element, enabling STL-style iteration.

Returns:

Pointer to the first element

inline constexpr T *end() noexcept#

Get mutable iterator to end

Returns a pointer to one past the last element, enabling STL-style iteration.

Returns:

Pointer to one past the last element

inline constexpr const T *begin() const noexcept#

Get const iterator to beginning

Returns a const pointer to the first element, enabling STL-style iteration for const arrays.

Returns:

Const pointer to the first element

inline constexpr const T *end() const noexcept#

Get const iterator to end

Returns a const pointer to one past the last element, enabling STL-style iteration for const arrays.

Returns:

Const pointer to one past the last element

inline constexpr T *data() noexcept#

Get a pointer to the data of the array

Returns a pointer to the first element of the array.

Returns:

A pointer to the first element of the array

inline constexpr const T *data() const noexcept#

Get a const pointer to the data of the array

Returns a const pointer to the first element of the array.

Returns:

A const pointer to the first element of the array

inline T product() const#

Calculate the product of all elements

Computes the product of all elements in the array (a₀ × a₁ × … × aₙ₋₁).

Returns:

The product of all elements
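
As a worked example of the formula, the sketch below computes the same product over a std::array. demo_product and element_count are hypothetical names, and using the product to count the elements of a tensor shape is an assumed use case, not one stated by the library.

```cpp
#include <array>
#include <cstddef>
#include <functional>
#include <numeric>

// Hypothetical free-function equivalent of product(), written over std::array.
template <typename T, std::size_t DIM>
T demo_product(const std::array<T, DIM> &values) {
    // Start from the multiplicative identity and fold with operator*.
    return std::accumulate(values.begin(), values.end(), T{1}, std::multiplies<T>{});
}

inline std::size_t element_count() {
    // Assumed use case: a 2 x 3 x 4 shape holds 2 * 3 * 4 = 24 elements.
    const std::array<std::size_t, 3> shape{2, 3, 4};
    return demo_product(shape);
}
```

The result has the element type T, so a product of many large integers can overflow just as repeated multiplication by hand would.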

Public Static Functions

static inline constexpr std::size_t size() noexcept#

Get the size of the array

Returns the number of elements in the array.

Returns:

The number of elements in the array

Friends

inline friend bool operator==(const Arr &lhs, const Arr &rhs)#

Equality comparison operator

Compares two arrays element-wise for equality. Uses index-based comparison for CUDA-compatible operation.

Parameters:
  • lhs[in] The first array to compare

  • rhs[in] The second array to compare

Returns:

True if all corresponding elements are equal, false otherwise

inline friend bool operator!=(const Arr &lhs, const Arr &rhs)#

Inequality comparison operator

Compares two arrays element-wise for inequality.

Parameters:
  • lhs[in] The first array to compare

  • rhs[in] The second array to compare

Returns:

True if any corresponding elements are not equal, false otherwise

class CudaDriverException : public std::exception#
#include <exceptions.hpp>

Exception class for CUDA driver API errors

This exception wraps CUresult values and provides detailed error information including error names, descriptions, and optional user context.

Public Functions

inline explicit CudaDriverException(
const CUresult result,
const std::string_view user_str = "",
)#

Construct a CUDA driver exception from a driver API result code

Creates a detailed error message that includes the error name and description obtained from the CUDA driver API, along with optional user-provided context.

Parameters:
  • result[in] The CUDA driver API result code that caused the exception

  • user_str[in] Optional user-provided context string to include in the error message

inline const char *what() const noexcept override#

Get the error message for this exception

Returns:

Formatted error message containing error name, description, and optional user context

class CudaRuntimeException : public std::exception#
#include <exceptions.hpp>

Exception class for CUDA runtime API errors

This exception wraps cudaError_t values and provides human-readable error messages through the CUDA runtime API.

Public Functions

inline explicit CudaRuntimeException(const cudaError_t status)#

Construct a CUDA exception from a CUDA runtime error code

Parameters:

status[in] The CUDA runtime error code that caused the exception

inline const char *what() const noexcept override#

Get the error message for this exception

Returns:

Human-readable error message from cudaGetErrorString

class CudaStream#
#include <cuda_stream.hpp>

RAII wrapper for CUDA stream management

This class provides automatic lifetime management for cudaStream_t handles. The stream is created with cudaStreamNonBlocking flag and automatically synchronized and destroyed when the object goes out of scope.

Example usage:

{
    CudaStream stream;
    kernel<<<blocks, threads, 0, stream.get()>>>();
    stream.synchronize();  // Optional explicit sync
}  // Stream automatically synchronized and destroyed here

Public Functions

CudaStream()#

Create and initialize CUDA stream

Creates a non-blocking CUDA stream using cudaStreamNonBlocking flag.

Throws:

std::runtime_error – if CUDA stream creation fails

~CudaStream()#

Synchronize and destroy CUDA stream

Automatically synchronizes the stream before destroying it to ensure all queued operations complete. Errors during cleanup are logged but never thrown, because the destructor is noexcept.

CudaStream(const CudaStream&) = delete#
CudaStream &operator=(const CudaStream&) = delete#
CudaStream(CudaStream &&other) noexcept#

Move constructor - transfer ownership of CUDA stream

Transfers ownership of the CUDA stream from another CudaStream object. The source object is left in a valid but empty state (nullptr stream).

Parameters:

other[in] Source CudaStream to move from

CudaStream &operator=(CudaStream &&other) noexcept#

Move assignment operator - transfer ownership of CUDA stream

Synchronizes and destroys the current stream (if any), then transfers ownership of the CUDA stream from another CudaStream object. The source object is left in a valid but empty state (nullptr stream).

Parameters:

other[in] Source CudaStream to move from

Returns:

Reference to this object
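
The ownership rules described for the move constructor and move assignment operator can be sketched without CUDA. In the example below, DemoHandle, demo_create, demo_destroy, live_handles, and DemoOwner are all hypothetical stand-ins: a heap-allocated int plays the role of the stream handle, and a counter replaces real stream creation and destruction. It is an illustration of the pattern, not the CudaStream implementation.

```cpp
#include <utility>

// Hypothetical stand-ins for a C-style handle API (not CUDA): creation and
// destruction only track how many handles are alive.
using DemoHandle = int *;

inline int &live_handles() {
    static int count = 0;
    return count;
}

inline DemoHandle demo_create() {
    ++live_handles();
    return new int{0};
}

inline void demo_destroy(DemoHandle handle) {
    --live_handles();
    delete handle;
}

// Move-only owner following the semantics described above.
class DemoOwner {
public:
    DemoOwner() : handle_{demo_create()} {}
    ~DemoOwner() { reset(); }

    DemoOwner(const DemoOwner &) = delete;
    DemoOwner &operator=(const DemoOwner &) = delete;

    // Move constructor: take the handle and leave the source empty.
    DemoOwner(DemoOwner &&other) noexcept
            : handle_{std::exchange(other.handle_, nullptr)} {}

    // Move assignment: release the current handle first, then take over.
    DemoOwner &operator=(DemoOwner &&other) noexcept {
        if (this != &other) {
            reset();
            handle_ = std::exchange(other.handle_, nullptr);
        }
        return *this;
    }

    DemoHandle get() const noexcept { return handle_; }

private:
    // Destroy the owned handle, if any; safe to call on a moved-from object.
    void reset() noexcept {
        if (handle_ != nullptr) {
            demo_destroy(handle_);
            handle_ = nullptr;
        }
    }

    DemoHandle handle_{nullptr};
};
```

Leaving the moved-from object with a null handle is what keeps its destructor from destroying a handle that now belongs to another owner.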

inline cudaStream_t get() const noexcept#

Get the underlying CUDA stream handle

Returns:

CUDA stream handle for use with CUDA APIs

bool synchronize() const noexcept#

Synchronize the CUDA stream

Blocks the calling CPU thread until all previously queued operations on this stream have completed.

Returns:

true if synchronization succeeded, false on error (error is logged)

class NvErrorCategory : public std::error_category#
#include <errors.hpp>

Custom error category for NV errors

This class provides human-readable error messages and integrates NV errors with the standard C++ error handling system.

Public Functions

inline const char *name() const noexcept override#

Get the name of this error category

Returns:

The category name as a C-style string

inline std::string message(const int condition) const override#

Get a descriptive message for the given error code – O(1) lookup, no heap allocation except for the implicit std::string construction required by the std::error_category interface.

Parameters:

condition[in] The error code value

Returns:

A descriptive error message

inline std::error_condition default_error_condition(
const int condition,
) const noexcept override#

Map NV errors to standard error conditions where applicable

Parameters:

condition[in] The error code value

Returns:

The equivalent standard error condition, or a default-constructed condition

Public Static Functions

static inline const char *name(const int condition)#

Get the name of the error code enum value

Parameters:

condition[in] The error code value

Returns:

The enum name as a string (e.g., “success”, “internal_error”)

class NvException : public std::exception#
#include <exceptions.hpp>

Exception class for NV library errors

This exception wraps NvErrc error codes and provides human-readable error messages through the NV error category system.

Public Functions

inline explicit NvException(const NvErrc status)#

Construct a NV exception from an error code

Parameters:

status[in] The NV error code that caused the exception

inline const char *what() const noexcept override#

Get the error message for this exception

Returns:

Human-readable error message from the NV error category

class NvFnException : public std::exception#
#include <exceptions.hpp>

Exception class for NV function-specific errors

This exception provides detailed error information including the function name that failed, along with the error code and description from the NV error system.

Public Functions

inline NvFnException(
const NvErrc status,
const std::string_view function_name_str,
)#

Construct a NV function exception with function context

Creates a detailed error message that includes the function name, error code message, and error code name for debugging purposes.

Parameters:
  • status[in] The NV error code that caused the exception

  • function_name_str[in] The name of the function that failed

inline const char *what() const noexcept override#

Get the error message for this exception

Returns:

Formatted error message containing function name, error message, and error name
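
One common way to build such an exception is to format the message once in the constructor and return a pointer into the stored string from what(). The sketch below assumes that approach; DemoFnException, describe_failure, the integer code, and the message wording are hypothetical and do not reproduce the library's format.

```cpp
#include <exception>
#include <string>
#include <string_view>

// Hypothetical exception carrying function context (illustration only).
class DemoFnException : public std::exception {
public:
    DemoFnException(const int code, const std::string_view function_name)
            : message_{std::string(function_name) + " failed with code " +
                       std::to_string(code)} {}

    // The message is built in the constructor, so what() only returns a
    // pointer and can be noexcept.
    const char *what() const noexcept override { return message_.c_str(); }

private:
    std::string message_;
};

// Throws the exception and returns its text as seen through the base class.
inline std::string describe_failure() {
    std::string text;
    try {
        throw DemoFnException(3, "demo_transform");
    } catch (const std::exception &ex) {
        text = ex.what();
    }
    return text;
}
```

Catching by `const std::exception &` still yields the full formatted message, which is why deriving from std::exception lets generic handlers report function-specific failures.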

struct TransparentStringHash#
#include <string_hash.hpp>

Transparent hash functor for string types that enables heterogeneous lookup.

This allows std::unordered_map<std::string, T> to perform lookups with string_view without constructing temporary strings, improving performance by avoiding allocations.

Background:

  • C++14 introduced heterogeneous lookup for ordered containers (std::map, std::set) using transparent comparators like std::less<>

  • C++20 extended this to unordered containers (std::unordered_map, std::unordered_set) requiring both transparent hash and comparator

  • The is_transparent tag enables the container to accept different key types

Example usage:

std::unordered_map<std::string, Module, TransparentStringHash,
std::equal_to<>> modules;

// Zero allocations - uses string_view directly
modules.find("key"sv);

// Zero allocations - uses const char* directly
modules.contains("literal");

// Works with std::string too
std::string key = "dynamic";
modules.find(key);

Performance benefits:

  • Eliminates temporary std::string allocations during lookups

  • Especially beneficial when using string literals or string_view keys

  • No overhead compared to std::hash<std::string> for std::string keys

Requirements:

  • C++20 for heterogeneous lookup support in unordered containers

  • Must be used with transparent comparator (e.g., std::equal_to<>)

Public Types

using is_transparent = void#

Tag enabling heterogeneous lookup.

Public Functions

inline std::size_t operator()(std::string_view sv) const noexcept#

Hash function for string_view and string-like types.

Uses std::hash<std::string_view> which works efficiently with all string-like types including std::string, std::string_view, and const char*.

Parameters:

sv[in] String view to hash

Returns:

Hash value