Nsight Profiling
NVIDIA Nsight Operator can inject NVIDIA Nsight Systems profiling support into NVCF function pods that run on a self-hosted GPU cluster. Use it after the NVCA Operator is installed and the cluster can deploy and invoke functions.
Nsight Operator injects profiling only into newly admitted pods. To enable profiling for NVCF functions, label the workload namespace before creating or recreating the function pods.
Prerequisites
- A self-hosted NVCF control plane with a registered GPU cluster.
- A healthy NVCA Operator and NVCA agent. See Self-Managed Clusters.
- The NVIDIA GPU Operator installed on the GPU cluster.
- kubectl and Helm access with permissions to install cluster-scoped resources.
- An external S3-compatible bucket for Nsight profiling results.
- The NVIDIA Nsight Operator resources bundle, including nsight_operator.py, installed on the workstation that will control profiling sessions.
Install the Python dependencies from the unpacked Nsight Operator resources bundle:
Storage
Use external S3-compatible storage for repeatable NVCF profiling runs. The stock Nsight Operator chart can deploy an in-cluster MinIO instance, but that default is intended for short-lived experiments. External object storage keeps captures available after Kubernetes pod restarts and makes downloads independent of the cluster lifecycle.
Create a Kubernetes Secret in the Nsight Operator namespace. Replace the placeholder values with credentials for your storage provider.
Apply the namespace and Secret:
If your storage provider does not require endpoint_url, omit it from
storage-config.yaml.
Install Nsight Operator
The profiling-only profile enables coordinator mode for on-demand captures, uses external S3 storage, disables operator-managed MinIO, and disables OTLP trace mirroring.
Install the chart:
To also enable OTLP trace mirroring, use this values file instead:
Verify that the Nsight Operator components are running:
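A minimal sketch of this check, assuming the chart was installed into the nsight-operator namespace (the namespace passed to autoconfigure elsewhere on this page). The function name nsight_operator_status and the 180-second timeout are illustrative, not part of the Nsight Operator tooling:

```shell
# List the Nsight Operator pods, then block until every pod reports Ready.
# Assumption: the operator was installed into the nsight-operator namespace.
nsight_operator_status() {
  ns="${1:-nsight-operator}"
  kubectl get pods -n "$ns" -o wide &&
    kubectl wait --for=condition=Ready pods --all -n "$ns" --timeout=180s
}
```

Call it as `nsight_operator_status`. If the namespace also contains completed Job pods, the wait step can time out even though the long-running components are healthy, so read the pod list as well as the exit status.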
Enable Profiling Manually
Container functions run in the shared nvcf-backend namespace. Label that
namespace before deploying or recreating container function pods:
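As a sketch, the label can be applied with kubectl. The label key and value, nvidia-nsight-profile=enabled, are the ones named in the Troubleshooting section of this page; the wrapper function name is illustrative:

```shell
# Add the profiling label to a namespace. --overwrite makes the call
# safe to repeat when the label is already present.
enable_nsight_profiling() {
  kubectl label namespace "$1" nvidia-nsight-profile=enabled --overwrite
}
```

For container functions this would be called as `enable_nsight_profiling nvcf-backend`.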
Helm functions run in dedicated namespaces created by NVCA. After deploying a Helm function, find the function namespace and label it:
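This page does not state how NVCA names Helm function namespaces, so one hedged way to find a freshly created one is to sort namespaces by creation time and look at the newest entries. The helper name recent_namespaces and the default of five rows are illustrative:

```shell
# Print the most recently created namespaces (newest last), with a column
# showing the current value of the profiling label, if any.
recent_namespaces() {
  kubectl get namespaces \
    --sort-by=.metadata.creationTimestamp \
    -L nvidia-nsight-profile | tail -n "${1:-5}"
}
```

Once the function namespace is identified, `kubectl label namespace <function-namespace> nvidia-nsight-profile=enabled` applies the label to it.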
Existing pods are not injected retroactively. Recreate function pods after adding the label. For container functions, delete the affected pod and allow NVCA to recreate it:
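A minimal sketch of recreating a container function pod. The helper name is illustrative, and the pod name must be looked up first, for example with `kubectl get pods -n nvcf-backend`:

```shell
# Delete one container-function pod so that NVCA recreates it and the
# replacement is admitted with the namespace label already in place.
# The namespace defaults to nvcf-backend, the shared container namespace.
recreate_function_pod() {
  kubectl delete pod "$1" -n "${2:-nvcf-backend}"
}
```

Usage: `recreate_function_pod <pod-name>`.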
For Helm functions, redeploy the function or restart the workload resource in the function namespace:
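The restart option can be sketched with kubectl rollout, which works for Deployments, StatefulSets, and DaemonSets. Which kind a given Helm function uses depends on its chart, so the helper below takes the resource as a kind/name argument; the helper name and the 300-second timeout are illustrative:

```shell
# Restart one workload in a Helm function namespace and wait for the
# replacement pods to roll out.
#   $1 = function namespace, $2 = resource as kind/name
restart_function_workload() {
  kubectl rollout restart "$2" -n "$1" &&
    kubectl rollout status "$2" -n "$1" --timeout=300s
}
```

Usage: `restart_function_workload <function-namespace> deployment/<name>`.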
Optional Kyverno Automation
Kyverno can label new NVCF workload namespaces as they are created. This covers
container functions through nvcf-backend when the namespace is created after
the policy exists, and Helm functions through their dedicated NVCA-created
namespaces.
Install Kyverno before applying this policy. Then create the ClusterPolicy:
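The following is a sketch of what such a policy could look like, not the policy shipped with NVCF. It uses standard Kyverno ClusterPolicy syntax (a mutate rule with patchStrategicMerge on Namespace resources). The policy and rule names are illustrative, and only nvcf-backend is matched: this page does not give the naming pattern of NVCA-created Helm function namespaces, so a matching entry for those has to be added under names.

```shell
# Print a sketch of a Kyverno ClusterPolicy that adds the profiling label
# to matching namespaces at admission time.
# Assumptions: policy/rule names are illustrative; the "names" list only
# covers nvcf-backend and needs a pattern for Helm function namespaces.
print_nsight_label_policy() {
  cat <<'EOF'
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: label-nvcf-namespaces-for-nsight
spec:
  rules:
    - name: add-nsight-profile-label
      match:
        any:
          - resources:
              kinds:
                - Namespace
              names:
                - nvcf-backend
      mutate:
        patchStrategicMerge:
          metadata:
            labels:
              nvidia-nsight-profile: enabled
EOF
}
```

Redirecting the output to a file, for example `print_nsight_label_policy > nsight-label-policy.yaml`, gives a manifest that can be reviewed and edited before it is applied.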
Apply the policy:
Verify namespace labels:
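One way to do this, sketched with a label selector so that only namespaces already carrying the profiling label are listed (the helper name is illustrative):

```shell
# List every namespace that currently carries the profiling label.
# Each output line has the form "namespace/<name>".
labelled_namespaces() {
  kubectl get namespaces -l nvidia-nsight-profile=enabled -o name
}
```

A function namespace that is missing from this list has not been labelled yet.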
If nvcf-backend or existing Helm function namespaces were created before the
policy, label them manually or use Kyverno mutate-existing features in your
cluster policy. Recreate existing function pods after the label is present.
Run a Capture
Use this flow with an existing Gemma-based LLM function, or create one using the LLM Gateway function configuration pattern. Keep the model name aligned with the function’s configured models[].name value.
Set the function and invocation variables:
Deploy the function with a single GPU if it is not already deployed:
Wait until the function pod is running and has been recreated after the Nsight label was applied:
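A hedged sketch of this check: print each pod's creation timestamp so it can be compared with the time the namespace label was applied, then wait for the chosen pod to become Ready. The helper name and the 300-second timeout are illustrative:

```shell
# Show name, phase, and creation time for the pods in a function namespace,
# then wait until the named pod reports Ready.
#   $1 = namespace (nvcf-backend for container functions), $2 = pod name
function_pod_ready() {
  ns="$1"
  pod="$2"
  kubectl get pods -n "$ns" \
    -o custom-columns=NAME:.metadata.name,PHASE:.status.phase,CREATED:.metadata.creationTimestamp &&
    kubectl wait --for=condition=Ready "pod/$pod" -n "$ns" --timeout=300s
}
```

Usage: `function_pod_ready nvcf-backend <pod-name>`. A CREATED value older than the labelling time means the pod predates the label and still needs to be recreated.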
Configure the Nsight CLI. autoconfigure discovers the Nsight Gateway and
storage settings from the cluster.
Start a profiling session:
Invoke the model through the NVCF LLM route while profiling is active:
Stop and close the session:
List and download the report:
The downloaded directory should contain one or more .nsys-rep files. Open the
reports with NVIDIA Nsight Systems.
Troubleshooting
Pods are not injected
Check the namespace label:
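As a sketch, the check can be turned into a helper that succeeds only when the label is present (the helper name is illustrative):

```shell
# Exit with status 0 if the namespace carries nvidia-nsight-profile=enabled,
# and with a non-zero status otherwise. Prints nothing itself.
has_nsight_label() {
  kubectl get namespace "$1" --show-labels --no-headers |
    grep -q 'nvidia-nsight-profile=enabled'
}
```

Usage: `has_nsight_label nvcf-backend && echo "labelled" || echo "label missing"`.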
The namespace must have nvidia-nsight-profile=enabled before the pod is
created. If the pod already existed, recreate it. Also check the Nsight Operator
logs and verify that the process name matches the configured
injectionIncludePatterns.
Downloads fail
Verify that the storage Secret exists and that the bucket credentials can write and read profiling reports:
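The Secret itself can be located with `kubectl get secrets -n nsight-operator`. For the credentials, one hedged approach is a round trip through the bucket with the AWS CLI. This assumes the AWS CLI is installed on the workstation and that AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are set to the same credentials stored in the Secret; the helper name and the object key are illustrative:

```shell
# Round-trip a small object through the bucket: upload, read back, delete.
#   $1 = bucket name, $2 = optional endpoint URL for non-AWS S3 stores
check_bucket_rw() {
  bucket="$1"
  endpoint="${2:-}"
  key="nsight-rw-check-$$.txt"
  tmp="$(mktemp)" || return 1
  echo "nsight storage check" > "$tmp"
  # Reuse the positional parameters to hold the optional endpoint flag.
  if [ -n "$endpoint" ]; then
    set -- --endpoint-url "$endpoint"
  else
    set --
  fi
  aws s3 cp "$tmp" "s3://$bucket/$key" "$@" &&
    aws s3 cp "s3://$bucket/$key" - "$@" > /dev/null &&
    aws s3 rm "s3://$bucket/$key" "$@"
  status=$?
  rm -f "$tmp"
  return "$status"
}
```

Usage: `check_bucket_rw <bucket> <endpoint-url>`, or `check_bucket_rw <bucket>` when the provider needs no custom endpoint. A non-zero exit status points at the failing step: write, read, or delete.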
If nsight_operator.py configure was used instead of autoconfigure, configure
storage access separately before running download.
The Nsight Gateway is not reachable
Check the Nsight Gateway service:
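The gateway Service name is not given on this page, so a minimal sketch is to list all Services in the operator namespace and read off the type, cluster IP, and ports of the gateway entry. The namespace default and the helper name are assumptions:

```shell
# List the Services in the Nsight Operator namespace with their type,
# cluster IP, ports, and selectors, to locate the gateway Service.
nsight_services() {
  kubectl get services -n "${1:-nsight-operator}" -o wide
}
```

Usage: `nsight_services`.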
For a ClusterIP service, run nsight_operator.py autoconfigure -n nsight-operator
from a workstation with kubectl access. The CLI can set up port forwarding for
the gateway.
GPU metrics collectors conflict
Some GPU profiling and metrics collectors use the same low-level GPU interfaces. If profiling fails after injection, check whether another DCGM or GPU metrics collector is running on the node. Temporarily disable the conflicting collector, or run the profiling session during a maintenance window.
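A rough way to look for such a collector is to search pod names across all namespaces. This sketch assumes the collector's pod name contains "dcgm", which holds for the DCGM exporter deployed by the NVIDIA GPU Operator but may not hold for other collectors; the helper name is illustrative:

```shell
# List pods in any namespace whose name mentions DCGM, together with the
# node each one runs on. Exits non-zero when no such pod is found.
find_dcgm_pods() {
  kubectl get pods --all-namespaces -o wide | grep -i dcgm
}
```

Compare the NODE column of the output with the node that hosts the function pod being profiled.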