Core Concepts#
Nsight Python operates through three key primitives:
1. Annotations
An annotation wraps a region of code that launches GPU kernels and tags them for attribution.
Annotations can be used as decorators or context managers:
@nsight.annotate("torch")
def torch_kernel():
...
# or
with nsight.annotate("cutlass4"):
cutlass_kernel()
By default, each annotation is expected to contain a single kernel launch. For more detailed information about handling multiple kernels within an annotation, see the API documentation.
2. Kernel Analysis Decorator
Use nsight.analyze.kernel() to annotate a benchmark function. Nsight Python will rerun this function one configuration at a time. You can provide configurations in two ways:
At decoration time using the configs parameter.
At function call time by passing configs directly as an argument when invoking the decorated function.
@nsight.analyze.kernel
def benchmark(s):
...
benchmark(configs=[(1024,), (2048,)])
3. Plot Decorator
Add nsight.analyze.plot() to automatically generate plots from your profiling runs.
@nsight.analyze.plot(filename="plot.png", ylabel="Runtime (ns)")
@nsight.analyze.kernel(configs=[(1024,), (2048,)])
def benchmark(s):
...