cuda.tile.matmul
- cuda.tile.matmul(x, y, /)
Performs matrix multiply on the given tiles.
- Parameters:
Supported input datatypes: [f16, bf16, f32, f64, tf32, f8e4m3fn, f8e5m2, i8, u8]
If x and y have different dtypes, they are first promoted to a common dtype. The result dtype is the same as the promoted input dtype. The shapes of x and y are broadcast across all axes except the last two.
- Return type:
Example
>>> tx = ct.full((2, 4), 3, dtype=ct.float32)
>>> ty = ct.full((4, 8), 4, dtype=ct.float32)
>>> # default
>>> tz = ct.matmul(tx, ty)
>>> # use builtin `@`
>>> tz = tx @ ty
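The promotion and broadcasting behaviour described above can be illustrated on the host with NumPy, whose `matmul` follows similar conventions. This is only an analogy, not cuda.tile code: the arrays below are NumPy stand-ins for tiles, and the exact promotion rules of cuda.tile may differ from NumPy's.

```python
import numpy as np

# NumPy stand-ins for the tiles in the example (assumption: not cuda.tile objects).
x = np.full((2, 4), 3, dtype=np.float32)
y = np.full((4, 8), 4, dtype=np.float32)

# (2, 4) @ (4, 8) gives (2, 8); each entry sums 4 products of 3 * 4.
z = x @ y  # equivalent to np.matmul(x, y)

# Mixed dtypes: NumPy promotes float16 and float32 to float32 before multiplying.
x_half = x.astype(np.float16)
z_mixed = np.matmul(x_half, y)

# Batch broadcasting: leading axes are broadcast, the last two axes are multiplied.
x_batched = np.full((3, 2, 4), 3, dtype=np.float32)
z_batched = np.matmul(x_batched, y)  # y is reused for each of the 3 batches

print(z.shape, z_mixed.dtype, z_batched.shape)
```

In this sketch the 2-D operand `y` is broadcast against the leading batch axis of `x_batched`, which mirrors the statement that only the last two axes take part in the multiplication itself.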