Control Flow
Overview
CuTe DSL walks Python’s AST and converts each control-flow construct it finds into structured intermediate representation (IR). You can therefore write ordinary Python loops and branches while the compiler decides, statement by statement, whether to:
- evaluate the construct at compile time, if it is native Python control flow, or
- emit IR, if the control flow is marked as dynamic.
Passing IR values to native Python control flow results in an error.
For a high-level discussion of the overall pipeline, see the code-generation overview.
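To make the statement-by-statement decision more concrete, here is a minimal sketch that uses only Python's standard `ast` module. It is not CuTe DSL's implementation: the `callee_name` and `classify` helpers and the sample `kernel` source are hypothetical, and the rule applied (a `for` over `range_constexpr` or a predicate wrapped in `const_expr` is compile-time, everything else becomes IR) is a simplification of the behaviour this page describes.

```python
import ast

# Hypothetical sample source; it is only parsed, never executed.
SOURCE = '''
def kernel(n, bound, flag):
    for i in cutlass.range_constexpr(n):
        acc = i
    for j in range(bound):
        acc = j
    if cutlass.const_expr(flag):
        acc = 0
    while bound > 0:
        bound = bound - 1
'''

# Names that mark a construct as compile-time in this simplified model.
COMPILE_TIME_MARKERS = {"range_constexpr", "const_expr"}


def callee_name(node):
    """Return the trailing name of a call such as cutlass.const_expr(...), else None."""
    if not isinstance(node, ast.Call):
        return None
    func = node.func
    if isinstance(func, ast.Attribute):
        return func.attr
    if isinstance(func, ast.Name):
        return func.id
    return None


def classify(source):
    """Return sorted (line, construct, decision) tuples for every for/if/while."""
    results = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.For):
            kind, expr = "for", node.iter
        elif isinstance(node, ast.If):
            kind, expr = "if", node.test
        elif isinstance(node, ast.While):
            kind, expr = "while", node.test
        else:
            continue
        marked = callee_name(expr) in COMPILE_TIME_MARKERS
        results.append((node.lineno, kind, "compile-time" if marked else "emit IR"))
    return sorted(results)


for line, kind, decision in classify(SOURCE):
    print(f"line {line}: {kind} -> {decision}")
```

The sketch inspects only the loop iterable or the branch predicate, which mirrors how the annotations on this page are attached to exactly those expressions.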
For Loops
CuTe DSL recognises three kinds of ranges for `for` loops:
- `range` – the Python built-in, always lowered to IR
- `cutlass.range` – same as the Python built-in `range`, but supports advanced unrolling and pipelining control
- `cutlass.range_constexpr` – unrolled at compile time
range(…) / cutlass.range(…)
Use when you always want a loop in the generated IR, even if the inputs are Python values.
cutlass.range_constexpr(…)
Runs in the Python interpreter and is fully unrolled before code generation. All loop indices must be Constexpr (compile-time Python values).
Example:
@cute.jit
def control_flow_examples(bound: cutlass.Int32):
    n = 10

    # ✅ This loop is a Python loop, evaluated at compile time.
    for i in cutlass.range_constexpr(n):
        cute.printf("%d\\n", i)

    # ✅ This loop is dynamic, even when the bound is a Python value.
    for i in range(n):
        cute.printf("%d\\n", i)

    # ❌ This loop bound is a dynamic value, which is not allowed in a Python loop.
    # Use `range` instead.
    for i in cutlass.range_constexpr(bound):
        cute.printf("%d\\n", i)

    # ✅ This loop is dynamic and emitted as an IR loop.
    for i in range(bound):
        cute.printf("%d\\n", i)

    # ✅ This loop is dynamic and emitted as an IR loop with unrolling.
    for i in cutlass.range(bound, unroll=2):
        cute.printf("%d\\n", i)
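The difference between the two kinds of loop can be pictured with a toy model in plain Python. This is only an illustration under stated assumptions: `trace_constexpr_loop`, `trace_dynamic_loop`, and `body` are hypothetical helpers, and the strings they produce are made-up pseudo-IR, not what CuTe DSL actually emits.

```python
def body(index):
    """Pseudo-ops produced by one trace of the loop body for the given index."""
    return [f"printf({index})"]


def trace_constexpr_loop(n, body_fn):
    """Model of range_constexpr: Python runs the loop, so the body is emitted n times."""
    ops = []
    for i in range(n):
        ops.extend(body_fn(str(i)))  # the index is a concrete Python int
    return ops


def trace_dynamic_loop(bound, body_fn):
    """Model of range / cutlass.range: the body is traced once with a symbolic index."""
    inner = ["  " + op for op in body_fn("%i")]
    return [f"for %i in 0..{bound}:"] + inner + ["end"]


unrolled = trace_constexpr_loop(3, body)
looped = trace_dynamic_loop("%bound", body)
print("\n".join(unrolled))
print("\n".join(looped))
```

In this model the unrolled form grows with the trip count, while the loop form stays the same size whatever the bound is. That is also why the unrolled form needs the bound to be known while Python is running.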
If-Else Statements
Standard Python `if`/`elif`/`else` is supported.
- Predicate without annotation → lowered to IR.
- Predicate annotated with `cutlass.const_expr` → evaluated at compile time.
Example:
@cute.jit
def main(const_var: cutlass.Constexpr, dynamic_var: cutlass.Int32):
    # ✅ This branch is a Python branch, evaluated at compile time.
    if cutlass.const_expr(const_var):
        cute.printf("Const branch\\n")
    else:
        cute.printf("Const else\\n")

    # ✅ This branch is dynamic and emitted as an IR branch.
    if dynamic_var == 10:
        cute.printf("Dynamic True\\n")
    else:
        cute.printf("Dynamic False\\n")

    # ❌ Using a dynamic value with `cutlass.const_expr` is not allowed.
    if cutlass.const_expr(dynamic_var == 10):
        cute.printf("Bound is 10\\n")
While Loops
Standard Python `while` is supported.
- Condition without annotation → lowered to IR.
- Condition annotated with `cutlass.const_expr` → evaluated at compile time.
Example:
@cute.jit
def main(dynamic_var: cutlass.Int32):
    n = 0

    # ✅ This is a Python while loop, evaluated at compile time.
    while cutlass.const_expr(n < 10):
        cute.printf("Const branch\\n")
        n += 1

    # ✅ This is a dynamic while loop, emitted as an IR while loop.
    while dynamic_var == 10:
        cute.printf("Dynamic True\\n")
        n += 1

    # ❌ Using a dynamic value with `cutlass.const_expr` is not allowed.
    while cutlass.const_expr(n < dynamic_var):
        n += 1
Compile-Time Metaprogramming
Mix compile-time constructs with normal CuTe DSL code to generate specialised kernels without runtime overhead. A compile-time flag can, for example, toggle an optional ReLU epilogue:
@cute.kernel
def gemm(..., do_relu: cutlass.Constexpr):
    # main GEMM work
    ...
    if cutlass.const_expr(do_relu):  # compile-time guard
        # ReLU code is emitted only when do_relu is True
        ...

gemm(..., False)  # ReLU is omitted from the generated IR
gemm(..., True)   # ReLU is included
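The same specialisation idea can be shown outside the DSL. The sketch below is plain Python, not CuTe DSL: `build_pipeline` is a hypothetical stand-in for code generation, the list of step names stands in for the generated IR, and the doubling step is an arbitrary placeholder for the main work.

```python
def build_pipeline(do_relu: bool):
    """Return (step_names, run). The do_relu branch is taken once, at build time."""
    names = ["scale"]
    funcs = [lambda x: 2.0 * x]  # placeholder for the main work

    if do_relu:  # plays the role of cutlass.const_expr(do_relu)
        names.append("relu")
        funcs.append(lambda x: max(x, 0.0))

    def run(x):
        # No check of do_relu happens here: the decision is already baked in.
        for f in funcs:
            x = f(x)
        return x

    return names, run


names_plain, run_plain = build_pipeline(False)
names_relu, run_relu = build_pipeline(True)
print(names_plain, run_plain(-1.5))
print(names_relu, run_relu(-1.5))
```

Each call to `build_pipeline` yields a separate specialised function, in the same way that each distinct compile-time flag value above yields a separate generated kernel.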
Limitations of Dynamic Control Flow
- Early exit with `break`, `continue`, or `return`, the `pass` statement, and raising an exception from a control-flow body are not yet supported.
- Operations in the control-flow body are traced only when tracing is active in that region.
- Values originating in a control-flow body are not available outside the control flow.
- Changing the type of a variable in a control-flow body is not allowed.
Example:
@cute.jit
def control_flow_negative_examples(predicate: cutlass.Boolean):
    n = 10

    # ❌ This loop is dynamic, so early exit isn't allowed.
    for i in cutlass.range(n):
        if i == 5:
            break  # Early exit

    if predicate:
        val = 10
        # ❌ Returning from a control-flow body is not allowed.
        return
        # ❌ Raising an exception from a control-flow body is not allowed.
        raise ValueError("This is not allowed")
        # ❌ Using pass in a control-flow body is not allowed.
        pass

    # ❌ val is not available outside the dynamic if.
    cute.printf("%d\\n", val)

    if predicate:
        # ❌ Changing the type of a variable in a control-flow body is not allowed.
        n = 10.0
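One way to stay within these limits is to restructure the code so that no early exit is needed and every variable exists, with a fixed type, before the control flow starts. The sketch below shows that rewrite in plain Python rather than CuTe DSL, so it only demonstrates that the two forms compute the same result. The function names are hypothetical, and whether a particular form traces inside a `@cute.jit` function may depend on the DSL version.

```python
def first_match_with_break(values, target):
    """Early-exit version: fine in plain Python, but uses break."""
    found = -1
    for i in range(len(values)):
        if values[i] == target:
            found = i
            break
    return found


def first_match_without_break(values, target):
    """Same result without break: a flag guards the body once a match is found."""
    found = -1    # defined before the loop and kept as an int throughout
    done = False  # defined before the loop and kept as a bool throughout
    for i in range(len(values)):
        if not done:
            if values[i] == target:
                found = i
                done = True
    return found


samples = [([3, 5, 7, 5], 5), ([1, 2, 3], 9), ([], 0)]
for values, target in samples:
    print(first_match_with_break(values, target),
          first_match_without_break(values, target))
```

The flag version always runs the full trip count, which is the price of dropping the early exit. Both `found` and `done` are assigned before the loop, so nothing originates inside the loop body and no variable changes type.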