Examples#
cutile-basic ships with example programs in the examples/ directory
demonstrating both standard BASIC and GPU tile operations.
Hello World (examples/hello.bas)#
A classic BASIC program showing variables, arithmetic, conditionals, and loops.
10 REM Hello World in BASIC
20 PRINT "Hello, World!"
30 LET X = 42.0
40 LET Y = X * 2.0
50 PRINT "X = "; X
60 PRINT "Y = "; Y
70 IF Y > 80 THEN
80 PRINT "Y is large"
90 ELSE
100 PRINT "Y is small"
110 ENDIF
120 FOR I = 1 TO 5
130 PRINT "I = "; I
140 NEXT I
150 END
Vector Add (examples/vector_add.bas)#
A GPU kernel that computes C = A + B element-wise using the block ID.
10 REM Vector Add: C = A + B
20 INPUT N, A(), B()
30 DIM A(N), B(N), C(N)
40 TILE A(128), B(128), C(128)
50 LET C(BID) = A(BID) + B(BID)
60 OUTPUT C
70 END
The INPUT statement declares A and B as kernel parameters. BID
maps to the CUDA block index, and OUTPUT marks C for host readback.
Run it end-to-end with the Python demo script:
python examples/vector_add.py
This script lexes, parses, analyzes, compiles to cubin via the bytecode backend, launches the kernel with test data, and verifies the result.
GEMM (examples/gemm.bas)#
A tiled matrix multiply: C(M,N) = A(M,K) * B(K,N).
10 REM GEMM: C(M,N) = A(M,K) * B(K,N)
15 INPUT M, N, K, A(), B()
20 DIM A(M, K), B(K, N), C(M, N)
30 TILE A(128, 32), B(32, 128), C(128, 128), ACC(128, 128)
40 LET TILEM = INT(BID / INT(N / 128))
50 LET TILEN = BID MOD INT(N / 128)
60 LET ACC = 0.0
70 FOR KI = 0 TO INT(K / 32) - 1
80 LET ACC = MMA(A(TILEM, KI), B(KI, TILEN), ACC)
90 NEXT KI
100 LET C(TILEM, TILEN) = ACC
110 OUTPUT C
120 END
DIM declares array dimensions, TILE declares the tile/partition shape for
each variable. LET ACC = 0.0 initializes the accumulator tile, MMA performs
matrix multiply-accumulate, and LET C(...) = ACC writes the result tile.
Run it with:
python examples/gemm.py
Python Demo Scripts#
Three demo scripts in examples/ show end-to-end GPU execution:
vector_add.pyCompiles
vector_add.bas, launches with 1024-element arrays, verifiesC[i] = A[i] + B[i].gemm.pyCompiles
gemm.bas, launches a 512x512 GEMM, verifies against a CuPy reference (d_a @ d_b).hello.pyCompiles
hello.basto a cubin via the bytecode backend and launches it as a single-block kernel. Becausehello.bashas no GPU extensions, this serves as a minimal smoke test of the compilation and launch pipeline.python examples/hello.py