Examples#

cutile-basic ships with example programs in the examples/ directory demonstrating both standard BASIC and GPU tile operations.

Hello World (`examples/hello.bas`)#

A classic BASIC program showing variables, arithmetic, conditionals, and loops.

REM Hello World in BASIC
PRINT "Hello, World!"
LET X = 42.0
LET Y = X * 2.0
PRINT "X = "; X
PRINT "Y = "; Y
IF Y > 80 THEN
 PRINT "Y is large"
ELSE
 PRINT "Y is small"
ENDIF
FOR I = 1 TO 5
 PRINT "I = "; I
NEXT I
END

Vector Add (`examples/vector_add.bas`)#

A GPU kernel that computes C = A + B element-wise, with each block using its block ID (BID) to select the 128-element tile it processes.

REM Vector Add: C = A + B
INPUT N, A(), B()
DIM A(N), B(N), C(N)
TILE A(128), B(128), C(128)
LET C(BID) = A(BID) + B(BID)
OUTPUT C
END

The INPUT statement declares A and B as kernel parameters. BID maps to the CUDA block index, and OUTPUT marks C for host readback.

Run it end-to-end with the Python demo script:

python examples/vector_add.py

The script lexes, parses, and analyzes the program, compiles it to a cubin via the bytecode backend, launches the kernel with test data, and verifies the result.
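To make the per-block indexing in vector_add.bas concrete, here is a minimal host-side sketch in plain Python. It assumes that `A(BID)` selects the BID-th 128-element tile of the array and that the kernel is launched with one block per tile; neither assumption is confirmed by the text above. `emulate_vector_add` is a hypothetical helper for illustration and does not call cutile-basic.

```python
# Host-side emulation of the tile semantics assumed for vector_add.bas.
# emulate_vector_add is a hypothetical helper; it does not use cutile-basic.

TILE = 128  # tile length taken from `TILE A(128), B(128), C(128)`


def emulate_vector_add(a, b, tile=TILE):
    """Compute C = A + B one tile at a time, as one block per tile would."""
    n = len(a)
    if len(b) != n:
        raise ValueError("A and B must have the same length")
    if n % tile != 0:
        raise ValueError("this sketch assumes N is a multiple of the tile size")
    c = [0.0] * n
    num_blocks = n // tile  # assumed launch grid: one block per tile
    for bid in range(num_blocks):  # on a GPU these iterations run in parallel
        lo, hi = bid * tile, (bid + 1) * tile
        # LET C(BID) = A(BID) + B(BID): element-wise add over tile number BID
        c[lo:hi] = [x + y for x, y in zip(a[lo:hi], b[lo:hi])]
    return c


if __name__ == "__main__":
    n = 1024  # same array length the demo script is described as using
    a = [float(i) for i in range(n)]
    b = [2.0 * i for i in range(n)]
    c = emulate_vector_add(a, b)
    assert c == [3.0 * i for i in range(n)]
```

Under these assumptions, a 1024-element input is covered by 1024 / 128 = 8 blocks, each of which reads and writes only its own tile.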

GEMM (`examples/gemm.bas`)#

A tiled matrix multiply: C(M,N) = A(M,K) * B(K,N).

REM GEMM: C(M,N) = A(M,K) * B(K,N)
INPUT M, N, K, A(), B()
DIM A(M, K), B(K, N), C(M, N)
TILE A(128, 32), B(32, 128), C(128, 128), ACC(128, 128)
LET TILEM = INT(BID / INT(N / 128))
LET TILEN = BID MOD INT(N / 128)
LET ACC = 0.0
FOR KI = 0 TO INT(K / 32) - 1
 LET ACC = MMA(A(TILEM, KI), B(KI, TILEN), ACC)
NEXT KI
LET C(TILEM, TILEN) = ACC
OUTPUT C
END

DIM declares the array dimensions, while TILE declares the tile (partition) shape for each variable. LET ACC = 0.0 initializes the accumulator tile, MMA performs a matrix multiply-accumulate, and LET C(...) = ACC writes the result tile.

Run it with:

python examples/gemm.py
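The tile indexing in gemm.bas can be traced with a small host-side sketch in plain Python. It assumes one block per output tile of C, that `A(TILEM, KI)` selects the tile at tile-row TILEM and tile-column KI, and that `MMA(a, b, acc)` returns `acc + a * b`; these are readings of the listing, not documented guarantees. `emulate_gemm` is a hypothetical helper, and its tile sizes are parameters so that it can be checked on small matrices.

```python
# Host-side emulation of the tile indexing assumed for gemm.bas.
# emulate_gemm is a hypothetical helper; it does not use cutile-basic.


def emulate_gemm(a, b, m, n, k, tm=128, tn=128, tk=32):
    """Return C = A * B for row-major nested lists, computed tile by tile."""
    if m % tm or n % tn or k % tk:
        raise ValueError("this sketch assumes dimensions divide evenly into tiles")
    c = [[0.0] * n for _ in range(m)]
    tiles_n = n // tn  # INT(N / 128) in the BASIC source
    for bid in range((m // tm) * tiles_n):  # assumed grid: one block per C tile
        tile_m = bid // tiles_n  # LET TILEM = INT(BID / INT(N / 128))
        tile_n = bid % tiles_n   # LET TILEN = BID MOD INT(N / 128)
        acc = [[0.0] * tn for _ in range(tm)]  # LET ACC = 0.0
        for ki in range(k // tk):  # FOR KI = 0 TO INT(K / 32) - 1
            # LET ACC = MMA(A(TILEM, KI), B(KI, TILEN), ACC)
            for i in range(tm):
                a_row = a[tile_m * tm + i]
                for p in range(tk):
                    a_ip = a_row[ki * tk + p]
                    b_row = b[ki * tk + p]
                    for j in range(tn):
                        acc[i][j] += a_ip * b_row[tile_n * tn + j]
        for i in range(tm):  # LET C(TILEM, TILEN) = ACC
            c[tile_m * tm + i][tile_n * tn:(tile_n + 1) * tn] = acc[i]
    return c


if __name__ == "__main__":
    # Small shapes and tiles so the pure-Python loops finish quickly.
    m, n, k = 4, 6, 4
    a = [[float(i + p) for p in range(k)] for i in range(m)]
    b = [[float(p - j) for j in range(n)] for p in range(k)]
    ref = [[sum(a[i][p] * b[p][j] for p in range(k)) for j in range(n)]
           for i in range(m)]
    assert emulate_gemm(a, b, m, n, k, tm=2, tn=3, tk=2) == ref
```

With the tile shapes from the listing and the 512x512 problem size of the demo, this mapping would give (512 / 128) * (512 / 128) = 16 blocks, with BID 0 to 3 covering the first tile-row of C.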

Python Demo Scripts#

Three demo scripts in examples/ show end-to-end GPU execution:

vector_add.py

Compiles vector_add.bas, launches it with 1024-element arrays, and verifies that C[i] = A[i] + B[i].

gemm.py

Compiles gemm.bas, launches a 512x512 GEMM, and verifies the result against a CuPy reference (d_a @ d_b).

hello.py

Compiles hello.bas to a cubin via the bytecode backend and launches it as a single-block kernel. Because hello.bas has no GPU extensions, this serves as a minimal smoke test of the compilation and launch pipeline.

python examples/hello.py
