Release Notes
1.2.0 (2026-03-05)
CTK 13.2 features
Support Ampere and Ada (sm80 family) GPUs.
Support pip install cuda-tile[tileiras] to use tileiras from a Python environment without a system-wide CTK installation.
Add ct.atan2(y, x) operation for computing the arctangent of y/x.
Add optional rounding_mode parameter for ct.tanh(), supporting RoundingMode.FULL and RoundingMode.APPROX.
Compiling FP8 operations for sm80 family GPUs will raise TileUnsupportedFeatureError.
Setting opt_level=0 on ct.kernel is no longer required for ct.printf() and ct.print().
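A two-argument arctangent is usually preferred over atan(y / x) because it keeps track of the quadrant. The sketch below uses the standard library's math.atan2 on the host as a reference. It assumes that ct.atan2 follows the same conventional semantics, which these notes do not spell out. The helper name quadrant_angles is made up for illustration.

```python
import math

def quadrant_angles():
    """Return atan2 angles in whole degrees for one point in each quadrant."""
    points = [(1.0, 1.0), (1.0, -1.0), (-1.0, -1.0), (-1.0, 1.0)]  # (y, x) pairs
    return [round(math.degrees(math.atan2(y, x))) for y, x in points]

angles = quadrant_angles()

# A single-argument atan(y / x) cannot tell (1, 1) from (-1, -1):
# both ratios are 1.0, so both map to the same angle.
naive = [round(math.degrees(math.atan(y / x))) for y, x in [(1.0, 1.0), (-1.0, -1.0)]]
```

Here atan2 returns a distinct angle for each of the four quadrants, while the naive form collapses opposite quadrants onto one value.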
Features
Add ct.static_iter keyword that enables compile-time for loops.
Add ct.static_assert keyword that can be used to assert that a condition is true at compile time.
Add ct.static_eval keyword that enables compile-time evaluation using the host Python interpreter.
Add ct.scan() for custom scan.
Add ct.isnan().
Add print() and ct.print(), which support Python-style printing and f-strings.
Add optional mask parameter to ct.gather() and ct.scatter() for custom boolean masking.
Operator + can now be used to concatenate tuples.
Support unpacking nested tuples (e.g., a, (b, c) = t) and using square brackets for unpacking (e.g., [a, b] = 1, 2).
Add bytecode-to-cubin disk cache to avoid recompilation of unchanged kernels. Controlled by CUDA_TILE_CACHE_DIR and CUDA_TILE_CACHE_SIZE.
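The tuple-related items above use ordinary Python syntax. The following sketch runs on the host interpreter and only illustrates the three forms. It is not kernel code, and the function name concat_and_unpack is made up for illustration.

```python
def concat_and_unpack():
    """Exercise tuple concatenation and nested / bracketed unpacking."""
    shape = (4, 8) + (16,)   # operator + concatenates tuples
    t = (1, (2, 3))
    a, (b, c) = t            # nested tuple unpacking
    [d, e] = 1, 2            # square brackets as the unpacking target
    return shape, (a, b, c), (d, e)

result = concat_and_unpack()
```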
Bug Fixes
Fix a bug where nan != nan returns False.
Fix “potentially undefined variable $retval” error when a helper function returns after a while loop that contains no early return.
Fix the missing column indicator in error messages when the underlined text is only one character wide.
Add a missing check for unpacking a tuple with too many values. For example, a, b = 1, 2, 3 now raises an error instead of silently discarding the extra value.
Fix a bug where the promoted dtype of uint16 and uint64 was incorrectly set to uint32.
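For reference, the host Python interpreter already behaves the way the first and fourth fixes above describe. The sketch below shows that behavior in plain Python. The exception type cuTile raises for the unpacking case is not stated in these notes, so the ValueError here applies to CPython only.

```python
nan = float("nan")
nan_not_equal = nan != nan   # NaN compares unequal to everything, itself included
nan_equal = nan == nan

try:
    a, b = 1, 2, 3           # three values, two targets
    unpack_error = None
except ValueError as exc:
    # CPython rejects the extra value instead of dropping it.
    unpack_error = type(exc).__name__
```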
Enhancements
Erase the distinction between scalars and zero-dimensional tiles. They are now completely interchangeable.
~x for const boolean x will raise a TypeError to prevent inconsistent results compared to ~x on a boolean Tile.
Add TileUnsupportedFeatureError to the public API.
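The inconsistency behind the ~x change can be seen on the host: Python treats a bool as the integer 0 or 1, so ~ performs a bitwise inversion rather than a logical NOT. The sketch below assumes that ~ on a boolean Tile means element-wise logical NOT, and it models that with a plain list. Both helper names are made up for illustration.

```python
import warnings

def invert_python_bool(flag):
    """Apply ~ to a Python bool, which is treated as the integer 0 or 1."""
    with warnings.catch_warnings():
        # Recent Python versions emit a DeprecationWarning for ~ on bool.
        warnings.simplefilter("ignore", DeprecationWarning)
        return ~flag

def logical_not_elementwise(flags):
    """Host stand-in for the assumed meaning of ~ on a boolean tile."""
    return [not f for f in flags]

int_result = invert_python_bool(True)               # bitwise NOT of integer 1
tile_like = logical_not_elementwise([True, False])  # element-wise logical NOT
```

The integer result for True is -2, which is still truthy, whereas the element-wise form maps True to False.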
1.1.0 (2026-01-30)
Features
Add support for nested functions and lambdas.
Add support for custom reduction via ct.reduce().
Add Array.slice(axis, start, stop) to create a view of an array sliced along a single axis. The result shares memory with the original array (no data copy).
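The "shares memory" property of a slice view can be illustrated with a host-side analogy. The sketch below uses the standard library's memoryview over a one-dimensional bytearray. It does not use Array.slice itself and only shows what view semantics imply: a write through the view is visible in the original storage.

```python
buf = bytearray(range(8))      # backing storage holding 0..7
view = memoryview(buf)[2:5]    # window onto elements 2, 3, 4; no data copy
view[0] = 99                   # write through the view

original_after_write = list(buf)
view_contents = list(view)
```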
Bug Fixes
Fix reductions with multiple axes specified in non-increasing order.
Fix a bug when pattern matching (FusedMultiplyAdd) attempts to remove a value that is used by the new operation.
Enhancements
Allow assignments with type annotations. Type annotations are ignored.
Support constructors of built-in numeric types (bool, int, float), e.g., float('inf').
Lift the ban on recursive helper function calls. Instead, add a limit on recursion depth. Add a new exception class TileRecursionError, thrown at compile time when the recursion limit is reached during function call inlining.
Improve error messages for type mismatches in control flow statements.
Relax type checking rules for variables that are assigned a different type depending on the branch taken: it is now only an error if the variable is used afterwards.
Stricter rules for potentially-undefined variable detection: if a variable is first assigned inside a for loop, and then used after the loop, it is now an error because the loop may take zero iterations, resulting in a use of an undefined variable.
Include a full cuTile traceback in error messages. Improve formatting of code locations; include function names, remove unnecessary characters to reduce line lengths.
Delay the loading of CUDA driver until kernel launch.
Expose the TileError base class in the public API.
Add ct.abs() for completeness.
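The stricter potentially-undefined variable rule above guards against a failure that plain Python only reports at run time. The host-side sketch below shows the pattern and the usual remedy of binding the variable before the loop. The function names are made up for illustration, and the code is ordinary Python rather than kernel code.

```python
def last_item(items):
    """Return the last element seen by the loop; `last` is bound only inside it."""
    for item in items:
        last = item
    return last                # unbound if the loop ran zero times

def last_item_safe(items, default=None):
    """Same logic, with `last` bound before the loop."""
    last = default
    for item in items:
        last = item
    return last

ok = last_item([1, 2, 3])
try:
    last_item([])              # zero iterations
    failure = None
except UnboundLocalError as exc:
    failure = type(exc).__name__

safe = last_item_safe([])
```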
1.0.1 (2025-12-18)
Bug Fixes
Fix a bug in hash function that resulted in potential performance regression for kernels with many specializations.
Fix a bug where an if statement within a loop can trigger an internal compiler error.
Fix SliceType __eq__ comparison logic.
Enhancements
Improve error message for ct.cat().
Support is not None comparison.
1.0.0 (2025-12-02)
Initial release.