Constant Wrappers
This section describes wrapper types that carry constant values as well as API modifier flags. Using these types, one can pass compile time information to an API through a function argument instead of an explicit template argument.
Example
The following code shows how to use pass compile time values using both the explicit template argument and constant wrapper techniques.
namespace ct = ::cuda::tiles;
using namespace ct::literals;
// Example 1
ct::shape<2, 4> x;
ct::shape y{2_ic, 4_ic};
// Example 2
ct::dimension_map<2, 0, 1> w;
ct::dimension_map z{2_ic, 0_ic, 1_ic};
// Example 3
ct::add<ct::rounding_mode::round_toward_zero>(0.0, 1.0, {});
ct::add(0.0, 1, ct::round_toward_negative_t{});
// Example 4
double* ptr{nullptr};
ct::atomic_load<ct::memory_order::relaxed>(ptr);
ct::atomic_load(ptr, ct::memory_order_relaxed_t{});
// Example 5
ct::partition_view P{
ct::tensor_span{ptr, ct::shape{4_ic, 8_ic}},
ct::shape{2_ic, 2_ic}};
P.load_masked<ct::view_padding::positive_inf>(0, 1);
P.load_masked(ct::view_padding_positive_inf_t{}, 0, 1);
cuda::tiles::integral_constant
-
template<ct::integral auto V>
struct integral_constant
-
A
ct::integral_constantis a stateless empty type that encodes a compile time integral value.Aliases
-
using type = integral_constant
-
using value_type = decltype(V)
Member Variables
-
static constexpr value_type value = V
Conversion Operator
-
__tile__ __host__ __device__ constexpr operator value_type() const noexcept;
-
Conversion operator yielding the value
V.
Function Call Operator
-
using type = integral_constant
Overloaded Operators
-
template<auto x, auto y>
__tile__ __host__ __device__ constexpr ct::integral_constant<x + y> operator+(ct::integral_constant<x>, ct::integral_constant<y>) noexcept;
-
template<auto x, auto y>
__tile__ __host__ __device__ constexpr ct::integral_constant<x * y> operator*(ct::integral_constant<x>, ct::integral_constant<y>) noexcept;
-
template<auto x, auto y>
__tile__ __host__ __device__ constexpr ct::integral_constant<x - y> operator-(ct::integral_constant<x>, ct::integral_constant<y>) noexcept;
-
template<auto x, auto y>
__tile__ __host__ __device__ constexpr ct::integral_constant<x / y> operator/(ct::integral_constant<x>, ct::integral_constant<y>) noexcept;
-
template<auto x, auto y>
__tile__ __host__ __device__ constexpr ct::integral_constant<x % y> operator%(ct::integral_constant<x>, ct::integral_constant<y>) noexcept;
-
template<auto x, auto y>
__tile__ __host__ __device__ constexpr ct::integral_constant<x & y> operator&(ct::integral_constant<x>, ct::integral_constant<y>) noexcept;
-
template<auto x, auto y>
__tile__ __host__ __device__ constexpr ct::integral_constant<x | y> operator|(ct::integral_constant<x>, ct::integral_constant<y>) noexcept;
-
template<auto x, auto y>
__tile__ __host__ __device__ constexpr ct::integral_constant<x ^ y> operator^(ct::integral_constant<x>, ct::integral_constant<y>) noexcept;
-
template<auto x, auto y>
__tile__ __host__ __device__ constexpr ct::integral_constant<(x << y)> operator<<(ct::integral_constant<x>, ct::integral_constant<y>) noexcept;
-
template<auto x, auto y>
__tile__ __host__ __device__ constexpr ct::integral_constant<(x >> y)> operator>>(ct::integral_constant<x>, ct::integral_constant<y>) noexcept;
-
template<auto x>
__tile__ __host__ __device__ constexpr ct::integral_constant<~x> operator~(ct::integral_constant<x>) noexcept;
-
template<auto x>
__tile__ __host__ __device__ constexpr ct::integral_constant<+x> operator+(ct::integral_constant<x>) noexcept;
-
template<auto x>
__tile__ __host__ __device__ constexpr ct::integral_constant<-x> operator-(ct::integral_constant<x>) noexcept;
-
Overloaded operators for
ct::integral_constant. Each operator yields default constructed instance of its return type.
Literal Operator Template
-
template<char...>
__tile__ __host__ __device__ constexpr ct::integral_constant</* see below */> operator""_ic() noexcept;
-
A literal operator template 1 for integer literals suffixed with
_ic. This function inhabits thecuda::tiles::literalsnamespace.Yields a default constructed instance of
ct::integral_constantwhose constant value matches the value of the provided integer literal. The value type of the result matches the type the integer literal would have had the_icsuffix been omitted 2 except the program is ill-formed if that type would be an extended integer type.The program is ill-formed if the provided literal is not a decimal integer literal.
Example
The following code shows two
_icliterals and their corresponding types.namespace ct = ::cuda::tiles; using namespace ct::literals; // Type is `int` 22_ic; // Type is either `long int` or `long long int`. 4294967296_ic;
cuda::tiles::dimension_map
-
template<size_t... Idx>
requires /* atomic constraint */
struct dimension_map;
-
Stateless empty type representing a permutation of the integers \(0\) through
sizeof...(Idx)exclusive. Element \(k\) of theIdxpack indicates which dimension of the source should be placed at dimension \(k\) in the target during act::permute()operation. The rank of act::dimension_mapis the size of theIdxpack.The atomic constraint validates the
Idxpack is a permutation of the integers \(0\) through \(N-1\) inclusive where \(N\) is the rank. That is, each of theIdxvalues lies in the half open range \([0 N)\) and there are no duplicates. An emptyIdxpack represents a valid permutation.Default Construction
-
__tile__ __host__ __device__ constexpr dimension_map() noexcept = default;
-
Default constructs an instance of this type.
Construction From Arguments
-
template<typename ...Ts>
requires (sizeof...(Ts) > 0)
__tile__ __host__ __device__ explicit constexpr dimension_map(Ts...) noexcept;
-
Constructs an instance of this type, discarding any arguments that were passed to it. This constructor is used in conjunction with the deduction guide to enable CTAD based construction from
ct::integral_constantvalues.
rank
-
__tile__ __host__ __device__ static constexpr size_t rank() noexcept;
-
Yields the size of the Idx pack.
mapping
-
__tile__ __host__ __device__ static constexpr size_t mapping(rank_type i) noexcept;
-
Yields the \(i^{th}\) element of pack Idx. The behavior is undefined if \(i \geq N\) where \(N\) is parameter pack size.
Deduction Guide
-
template<auto... Vs>
requires /* atomic constraint */
dimension_map(integral_constant<Vs>...) -> dimension_map<size_t(Vs)...>;
-
Deduction guide enabling CTAD from
ct::integral_constantarguments. The atomic constraint validates that the expressionsize_t(Vs)is well formed for each element of theVspack.Example
namespace ct = ::cuda::tiles; using namespace ct::literals; // Type: ct::dimension_map<2, 0, 1> ct::dimension_map{2_ic, 0_ic, 1_ic};
-
__tile__ __host__ __device__ constexpr dimension_map() noexcept = default;
Rounding Mode
APIs for working with round modes.
cuda::tiles::rounding_mode
cuda::tiles::rounding_mode_constant
-
template<ct::rounding_mode Value>
requires /* atomic constraint */
struct rounding_mode_constant;
-
A specialization of
rounding_mode_constantencodes a rounding mode in its type. The atomic constraint validates thatValueis an enumerator ofct::rounding_mode.Example
Example usage of
rounding_mode_constantto infer the rounding mode non-type template parameter ofct::add()using a function argument.namespace ct = ::cuda::tiles; ct::rounding_mode_constant< ct::rounding_mode::round_toward_negative> mode; auto result = ct::add(1.0, 2.5, mode);
Member Aliases
-
using type = rounding_mode_constant;
-
using value_type = ct::rounding_mode;
Member Variables
-
static constexpr ct::rounding_mode value = Value;
Member Functions
-
__tile__ __host__ __device__ constexpr operator ct::rounding_mode() const noexcept;
-
Conversion operator yielding Value.
-
__tile__ __host__ __device__ constexpr ct::rounding_mode operator()() const noexcept;
-
Call operator yielding Value.
-
using type = rounding_mode_constant;
Constant Aliases
-
using round_ties_to_even_t = ct::rounding_mode_constant<ct::rounding_mode::round_ties_to_even>;
-
using round_toward_zero_t = ct::rounding_mode_constant<ct::rounding_mode::round_toward_zero>;
-
using round_toward_negative_t = ct::rounding_mode_constant<ct::rounding_mode::round_toward_negative>;
-
using round_toward_positive_t = ct::rounding_mode_constant<ct::rounding_mode::round_toward_positive>;
-
using round_full_t = ct::rounding_mode_constant<ct::rounding_mode::round_full>;
-
using round_approximate_t = ct::rounding_mode_constant<ct::rounding_mode::round_approximate>;
-
using default_rounding_mode_t = ct::rounding_mode_constant<ct::default_rounding_mode()>;
-
Convenience type aliases for inferring the rounding mode non-type template parameter of arithmetic and math APIs.
Example
Example usage of the rounding mode constant aliases:
namespace ct = ::cuda::tiles; auto result = ct::add(1.0, 2.5, ct::round_toward_negative_t{});
cuda::tiles::default_rounding_mode
-
__tile__ __host__ __device__ constexpr ct::rounding_mode default_rounding_mode() noexcept;
-
Yields the default rounding mode which is Round Ties to Even.
Subnormals Rounding Mode
APIs for working with subnormals rounding modes.
cuda::tiles::subnormals_rounding_mode
-
enum class subnormals_rounding_mode : /* unspecified */
-
Type enumerating the available subnormals rounding modes.
cuda::tiles::subnormals_rounding_mode_constant
-
template<ct::subnormals_rounding_mode Value>
requires /* atomic constraint */
struct subnormals_rounding_mode_constant;
-
A specialization of
subnormals_rounding_mode_constantencodes a subnormals rounding mode in its type. The atomic constraint validates thatValueis one of the enumerators ofct::subnormals_rounding_mode.Example
Example usage of
subnormals_rounding_mode_constantto infer the subnormals rounding mode non-type template parameter ofct::add()using a function argument.namespace ct = ::cuda::tiles; ct::subnormals_rounding_mode_constant< ct::subnormals_rounding_mode::round_subnormals_to_zero> mode; auto result = ct::add(1.0f, 2.5f, {}, mode);
Member Aliases
-
using type = subnormals_rounding_mode_constant;
-
using value_type = ct::subnormals_rounding_mode;
Member Variables
-
static constexpr ct::subnormals_rounding_mode value = Value;
Member Functions
-
__tile__ __host__ __device__ constexpr operator ct::subnormals_rounding_mode() const noexcept;
-
Conversion operator yielding Value.
-
__tile__ __host__ __device__ constexpr ct::subnormals_rounding_mode operator()() const noexcept;
-
Call operator yielding Value.
-
using type = subnormals_rounding_mode_constant;
Constant Aliases
-
using preserve_subnormals_t = ct::subnormals_rounding_mode_constant<ct::subnormals_rounding_mode::preserve_subnormals>;
-
using round_subnormals_to_zero_t = ct::subnormals_rounding_mode_constant<ct::subnormals_rounding_mode::round_subnormals_to_zero>;
-
using default_subnormals_rounding_mode_t = ct::subnormals_rounding_mode_constant<ct::default_subnormals_rounding_mode()>;
-
Convenience type aliases for inferring the subnormals rounding mode non-type template parameter of arithmetic and math APIs.
Example
Example usage of the subnormals rounding mode constant aliases:
namespace ct = ::cuda::tiles; auto result = ct::add(1.0f, 2.5f, {}, ct::round_subnormals_to_zero_t{});
cuda::tiles::default_subnormals_rounding_mode
-
__tile__ __host__ __device__ constexpr ct::subnormals_rounding_mode default_subnormals_rounding_mode() noexcept;
-
Returns the default subnormals rounding mode which is Preserve Subnormals.
NaN Propagation Mode
APIs for working with NaN propagation modes.
cuda::tiles::nan_propagation_mode
-
enum class nan_propagation_mode : /* unspecified */
-
Type enumerating the available NaN propagation modes.
cuda::tiles::nan_propagation_constant
-
template<ct::nan_propagation_mode Value>
requires /* atomic constraint */
struct nan_propagation_mode_constant;
-
A specialization of
ct::nan_propagation_modeencodes a NaN propagation modes in its type. The atomic constraint validates that Value is an enumerator ofct::nan_propagation_mode.Member Aliases
-
using type = nan_propagation_mode_constant;
-
using value_type = ct::nan_propagation_mode;
Member Variables
-
static constexpr ct::nan_propagation_mode value = Value;
Member Functions
-
__tile__ __host__ __device__ constexpr operator ct::nan_propagation_mode() const noexcept;
-
Yields the propagation mode constant Value.
-
__tile__ __host__ __device__ constexpr ct::nan_propagation_mode operator()() const noexcept;
-
Yields the propagation mode constant Value.
-
using type = nan_propagation_mode_constant;
Constant Aliases
-
using suppress_nan_t = ct::nan_propagation_mode_constant<ct::nan_propagation_mode::suppress_nan>;
-
using propagate_nan_t = ct::nan_propagation_mode_constant<ct::nan_propagation_mode::propagate_nan>;
-
using default_nan_propagation_mode_t = ct::nan_propagation_mode_constant<ct::default_nan_propagation_mode()>;
-
Convenience type aliases for inferring the NaN propagation mode non-type template parameter of arithmetic APIs.
Example
Example usage of the NaN propagation mode constant aliases:
namespace ct = ::cuda::tiles; auto result = ct::max(1.0, 2.5, ct::propagate_nan_t{});
cuda::tiles::default_nan_propagation_mode
-
__tile__ __host__ __device__ inline constexpr ct::nan_propagation_mode default_nan_propagation_mode();
-
Yields the default nan propagation mode.
View Padding
APIs for working with view padding values.
cuda::tiles::view_padding
cuda::tiles::view_padding_constant
-
template<ct::view_padding Value>
requires /* atomic constraint */
struct view_padding_constant;
-
A specialization of
ct::view_padding_constantencodes a view padding value in its type. The atomic constraint validates thatValueis an enumerator ofct::view_padding.Member Aliases
-
using type = view_padding_constant;
-
using value_type = ct::view_padding;
Member Variables
-
static constexpr ct::view_padding value = Value;
Member Functions
-
__tile__ __host__ __device__ constexpr operator ct::view_padding() const noexcept;
-
Yields the value Value.
-
__tile__ __host__ __device__ constexpr ct::view_padding operator()() const noexcept;
-
Yields the value Value.
-
using type = view_padding_constant;
Constant Aliases
-
using view_padding_zero_t = ct::view_padding_constant<ct::view_padding::zero>;
-
using view_padding_negative_zero_t = ct::view_padding_constant<ct::view_padding::negative_zero>;
-
using view_padding_positive_inf_t = ct::view_padding_constant<ct::view_padding::positive_inf>;
-
using view_padding_negative_inf_t = ct::view_padding_constant<ct::view_padding::negative_inf>;
-
using view_padding_nan_t = ct::view_padding_constant<ct::view_padding::nan>;
-
using default_view_padding_t = ct::view_padding_constant<ct::default_view_padding()>;
-
Convenience aliases for specifying a view padding constant in view operations.
Example
Example usage of the view padding constant aliases:
namespace ct = ::cuda::tiles; using namespace ct::literals; int x = 0; ct::tensor_span t{&x, ct::shape{1_ic}}; ct::partition_view p{t, ct::shape{1_ic}}; p.load_masked(ct::view_padding_zero_t{}, 0);
cuda::tiles::default_view_padding
-
__tile__ __host__ __device__ inline constexpr ct::view_padding default_view_padding();
-
Yields the default view padding.
Memory Order
APIs for working with memory orders.
cuda::tiles::memory_order
cuda::tiles::memory_order_constant
-
template<memory_order Value>
requires /* atomic constraint */
struct memory_order_constant;
-
A specialization of
ct::memory_order_constantencodes a memory order value in its type. The atomic constraint validates thatValueis an enumerator ofct::memory_order.Member Aliases
-
using type = memory_order_constant;
-
using value_type = ct::memory_order;
Member Variables
-
static constexpr ct::memory_order value = Value;
Member Functions
-
__tile__ __host__ __device__ constexpr operator ct::memory_order() const noexcept;
-
Yields Value.
-
__tile__ __host__ __device__ constexpr ct::memory_order operator()() const noexcept;
-
Yields Value.
-
using type = memory_order_constant;
Constant Aliases
-
using memory_order_relaxed_t = ct::memory_order_constant<ct::memory_order::relaxed>;
-
using memory_order_acquire_t = ct::memory_order_constant<ct::memory_order::acquire>;
-
using memory_order_release_t = ct::memory_order_constant<ct::memory_order::release>;
-
using memory_order_acq_rel_t = ct::memory_order_constant<ct::memory_order::acq_rel>;
-
Convenience aliases for specifying a memory order in atomic memory APIs.
Example
Example usage of the memory order constant aliases:
namespace ct = ::cuda::tiles; ct::atomic_load(ptr, ct::memory_order_relaxed_t{});
cuda::tiles::read_memory_order
-
template<ct::memory_order Order>
concept read_memory_order = /* atomic constraint */;
-
Indicates whether
Orderis a read memory order.
cuda::tiles::write_memory_order
-
template<ct::memory_order Order>
concept write_memory_order = /* atomic constraint */;
-
Indicates whether
Orderis a write memory order.
Thread Scope
APIs for working with thread scopes.
cuda::tiles::thread_scope
-
enum class thread_scope : /* unspecified */
-
Enumeration specifying the supported thread scopes.
cuda::tiles::thread_scope_constant
-
template<ct::thread_scope Value>
requires /* atomic constraint */
struct thread_scope_constant;
-
A specialization of
ct::thread_scope_constantencodes a thread scope value in its type. The atomic constraint validates thatValueis an enumerator ofct::thread_scope.Member Aliases
-
using type = thread_scope_constant;
-
using value_type = ct::thread_scope;
Member Variables
-
static constexpr ct::thread_scope value = Value;
Member Functions
-
__tile__ __host__ __device__ constexpr operator ct::thread_scope() const noexcept;
-
Yields Value.
-
__tile__ __host__ __device__ constexpr ct::thread_scope operator()() const noexcept;
-
Yields Value.
-
using type = thread_scope_constant;
Constant Aliases
-
using thread_scope_system_t = ct::thread_scope_constant<ct::thread_scope::system>;
-
using thread_scope_device_t = ct::thread_scope_constant<ct::thread_scope::device>;
-
using thread_scope_block_t = ct::thread_scope_constant<ct::thread_scope::block>;
-
using default_thread_scope_t = ct::thread_scope_constant<ct::default_thread_scope()>;
-
Convenience aliases for specifying a thread scope in atomic memory APIs.
Example
Example usage of the thread scope constant aliases:
namespace ct = ::cuda::tiles; ct::atomic_load(ptr, ct::memory_order_acquire_t{}, ct::thread_scope_device_t{});
cuda::tiles::default_thread_scope
-
__tile__ __host__ __device__ constexpr ct::thread_scope default_thread_scope() noexcept;
-
Yields the default thread scope for atomic memory APIs.
Footnotes