12. Release Notes

12.1. Known Issues

  • The programming model is missing a section on cross-tile-block kernels such as split-k.

  • The bytecode section does not yet provide the exact encoding of each operation; this is expected to be introduced in a future release.

  • The semi-formal memory model section is written but does not provide detailed examples of how to use it.

  • Atomics are currently limited in Tile IR and will be expanded in a future release.

12.2. Changelog

12.2.1. Spec 13.2 (2026-03-11)

Supported Architectures

  • Added support for Ampere (sm_80, sm_86, sm_87, sm_88) and Ada (sm_89) architectures.

New Operations

  • Added cuda_tile.atan2 operation for element-wise two-argument arctangent.
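
As a rough illustration of what an element-wise two-argument arctangent computes, the following Python sketch applies the standard library's math.atan2 to matching elements of two equal-length lists. It is a reference model only, not Tile IR syntax: the helper name atan2_elementwise is hypothetical, and the (y, x) operand order is an assumption borrowed from the usual atan2 convention rather than something stated in these notes.

```python
import math


def atan2_elementwise(ys, xs):
    """Return atan2(y, x) for each pair of matching elements.

    Hypothetical reference helper; the (y, x) operand order is an assumption.
    """
    if len(ys) != len(xs):
        raise ValueError("operands must have the same number of elements")
    return [math.atan2(y, x) for y, x in zip(ys, xs)]


# One point in each quadrant, so the sign handling is visible.
ys = [1.0, 1.0, -1.0, -1.0]
xs = [1.0, -1.0, -1.0, 1.0]
angles = atan2_elementwise(ys, xs)

for y, x, angle in zip(ys, xs, angles):
    print(f"atan2({y:+.1f}, {x:+.1f}) = {angle:+.4f} rad")
```

Unlike a single-argument arctangent of y / x, the two-argument form uses the signs of both inputs to select the quadrant, so the results cover the full range from -pi to pi.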

Updated Operations

  • Added overflow attribute to cuda_tile.negi to control integer overflow behavior.

  • Added rounding_mode attribute to cuda_tile.tanh to control floating-point rounding behavior.

  • Added token result to cuda_tile.print_tko for memory ordering support.

  • Added unsignedCmp flag to cuda_tile.for to support unsigned integer comparison for loop termination.

  • Renamed cuda_tile.print to cuda_tile.print_tko in the textual format. Bytecode encoding is unchanged and remains backward compatible.
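
To show why an unsigned comparison flag matters for loop termination, the Python sketch below models a counted loop of the form "for i = lower; i < upper; i += step" over 32-bit values. It is not Tile IR and does not reproduce the exact semantics of cuda_tile.for; the helper names, the 32-bit width, the strict less-than bound, and the positive step are all assumptions made for illustration.

```python
def as_unsigned(value, bits=32):
    """Reinterpret a signed integer's two's-complement bit pattern as unsigned."""
    return value & ((1 << bits) - 1)


def trip_count(lower, upper, step, unsigned_cmp, bits=32):
    """Count iterations of `i = lower; while i < upper: i += step`.

    Hypothetical model: when unsigned_cmp is set, both bounds are compared
    as unsigned integers of the given width. Assumes step > 0.
    """
    if step <= 0:
        raise ValueError("this sketch only models positive steps")
    if unsigned_cmp:
        lower = as_unsigned(lower, bits)
        upper = as_unsigned(upper, bits)
    # Ceiling division of the span by the step, clamped at zero.
    return max(0, -(-(upper - lower) // step))


# 0x80000000 is -2**31 when read as signed, 2**31 when read as unsigned.
upper_bits = -(2 ** 31)
signed_trips = trip_count(0, upper_bits, 2 ** 30, unsigned_cmp=False)
unsigned_trips = trip_count(0, upper_bits, 2 ** 30, unsigned_cmp=True)

print("signed comparison:  ", signed_trips, "iterations")
print("unsigned comparison:", unsigned_trips, "iterations")
```

With a signed comparison the upper bound reads as negative and the loop body never runs; with an unsigned comparison the same bit pattern is a large positive bound and the loop iterates.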