Getting Started#

Prerequisites and quick start guides for Nsight Operator.

Prerequisites#

Client Tools#

Install these on the machine you will run the profiling CLI from:

  • Python 3 – runs the nsight_operator.py CLI (and nsight_operator_dynamo.py for Dynamo workflows).

  • kubectl – used by autoconfigure, port-forwarding, and workload labeling.

  • Helm – used to install the operator (cluster admins only).

Cluster Requirements#

  • Kubernetes v1.19+ with the admissionregistration.k8s.io/v1 API enabled. Verify with:

    kubectl api-versions | grep admissionregistration.k8s.io/v1
    

    The output should be:

    admissionregistration.k8s.io/v1
    
  • kubectl access to the target cluster. Cluster-admin privileges are required for the initial operator install; namespace-admin rights are sufficient for per-tenant configuration in multi-tenant mode.

  • Node architecture: x86_64 or aarch64 (SBSA).

  • Container runtime: containerd, cri-o, or Docker. NVIDIA Container Runtime (runtimeClassName: nvidia) is required for GPU workloads and for Nsight Streamer GPU hardware acceleration.

  • NVIDIA GPU driver installed on nodes that will run profiled GPU workloads, or the NVIDIA GPU Operator managing drivers.

  • Node admin privileges: the operator sets kernel.perf_event_paranoid on worker nodes via a DaemonSet. If your cluster manages node sysctls declaratively (for example, OpenShift MachineConfig), set machineConfig: null in the Helm values and manage the sysctl yourself. See OpenShift.

  • Admission controllers: MutatingAdmissionWebhook and ValidatingAdmissionWebhook must be enabled in the correct order in the --enable-admission-plugins flag of kube-apiserver. This is the default on EKS, AKS, OKE, and GKE; self-managed clusters may need to update it. See the Kubernetes documentation.
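
    On self-managed clusters you can inspect the kube-apiserver flags directly; a sketch assuming a kubeadm-style static-pod control plane in the kube-system namespace:

    kubectl -n kube-system get pods -l component=kube-apiserver \
        -o jsonpath='{.items[0].spec.containers[0].command}' | tr ',' '\n' | grep admission-plugins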

Optional Prerequisites#

  • cert-manager if you plan to use TLS on the gateway with issuer-managed certificates (see TLS Configuration).

  • Gateway API CRDs (installed automatically by the operator by default) if you plan to expose Nsight Streamer externally via the STUNner TURN gateway. See STUNner TURN Gateway.

Quickstart Example#

This example shows the default on-demand profiling workflow: install Nsight Operator, label a workload, start and stop a profiling collection, and view or download the results.

1. Install the Nsight Operator#

This step requires cluster-admin access.

# Install the Nsight Operator and wait for it to be ready
helm install --wait \
    --namespace nsight-operator \
    --create-namespace \
    nsight-operator \
    https://helm.ngc.nvidia.com/nvidia/devtools/charts/nsight-operator-26.2.1.tgz
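
To confirm the operator components are running (pod names vary by release):

kubectl get pods -n nsight-operator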

2. Label your workload for profiling#

This step requires kubectl access to the target cluster resources. You can also select workloads for profiling by other criteria (for example, pod name or namespace); see Advanced Configuration for more information.

# Example for a deployment named "my-deployment"
kubectl patch deployment my-deployment -p '{"spec":{"template":{"metadata":{"labels":{"nvidia-nsight-profile":"enabled"}}}}}'
# Wait for the deployment to be ready
kubectl rollout status deployment/my-deployment

# Example for a statefulset named "my-statefulset"
kubectl patch statefulset my-statefulset -p '{"spec":{"template":{"metadata":{"labels":{"nvidia-nsight-profile":"enabled"}}}}}'
# Wait for the statefulset to be ready
kubectl rollout status statefulset/my-statefulset
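
You can verify that the restarted pods carry the profiling label:

kubectl get pods -l nvidia-nsight-profile=enabled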

3. Configure the CLI#

The Nsight Operator deploys an Envoy gateway that provides HTTP access to its services. The CLI must be configured to connect to this gateway before profiling commands can be used. Credentials and connection details are stored in ~/.nsight-cloud.conf.

Recommended – autoconfigure from cluster (requires kubectl access):

The CLI can be configured directly from the cluster if you have local kubectl access to the namespace where the operator is installed:

python3 nsight_operator.py autoconfigure -n <namespace>

This automatically discovers the gateway URL, authentication mechanism, and storage configuration from the cluster. If the Gateway is deployed as a ClusterIP service, port-forwarding to the gateway is handled automatically when invoking CLI commands.
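
You can sanity-check the saved settings afterwards (the exact keys in the file vary by configuration):

cat ~/.nsight-cloud.conf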

Alternative – manual configure (when kubectl access is not available):

Note

If you expose the gateway outside of the cluster (e.g. via LoadBalancer or NodePort), it is highly recommended to configure authentication. See Gateway Authentication for details.

If the gateway is exposed outside the cluster (e.g. via LoadBalancer or NodePort), you can configure the CLI manually:

# No authentication (default)
python3 nsight_operator.py configure --gw http://<gateway-ip>:8888

# With API key authentication -- read from an env var rather than a
# shell argument so the key is not saved in shell history.
export NSIGHT_API_KEY=...  # paste or load from your secret store
python3 nsight_operator.py configure --gw http://<gateway-ip>:8888 --apikey "${NSIGHT_API_KEY}"

When using a ClusterIP gateway, you must set up port-forwarding manually before calling configure:

kubectl port-forward -n <namespace> svc/nsight-operator-gateway 8888:8888 &
python3 nsight_operator.py configure --gw http://localhost:8888

4. Check Status#

Use the status command to confirm that your workload pods have restarted and the profiling agents are active:

python3 nsight_operator.py status

Output:

Tag default:

Collection agents:
- your-app-1
- your-app-2
No Active Session

Note

A tag is a label (default: default) that partitions agents and sessions on a single coordinator so that multiple disjoint profiling workflows can coexist. All CLI commands accept --tag <name> to target a specific tag. See Concepts and Terminology.

5. Start Profiling#

Use profiler-start to begin collection:

python3 nsight_operator.py profiler-start

Output:

Profiling started for Session=00000000-1111-2222-3333-444444444444 Collection=aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee.

6. Profile your workload#

Allow your workload to run for the desired profiling duration.

7. Stop Profiling#

python3 nsight_operator.py profiler-stop

Output:

Profiling stopped for Session=00000000-1111-2222-3333-444444444444 Collection=aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee.

8. Repeat profiling if needed#

Repeat steps 4-7 if you need to profile more than once. Move to the next step if you are done with profiling.
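
If you script repeated collections, a minimal sketch (the count and capture window are arbitrary):

for i in 1 2 3; do
    python3 nsight_operator.py profiler-start
    sleep 60   # capture window; adjust to your workload
    python3 nsight_operator.py profiler-stop
done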

9. List Profiling Results#

# List available profiling results
python3 nsight_operator.py ls

Output:

Connecting...
Session 00000000-0000-0000-0000-000000000000 active in state Active
Service default
+-- Session 00000000-0000-0000-0000-000000000000
    +-- your-app-1
    |   +-- File 0: Report 0 (Path: 00000000-0000-0000-0000-000000000000/0/default-your-app-1_63613cca.nsys-rep)
    +-- your-app-2
        +-- File 1: Report 0 (Path: 00000000-0000-0000-0000-000000000000/0/default-your-app-2_3f95aeda.nsys-rep)
[Active]

The generated profiling results are uploaded from the pods to the storage service, and reports are not listed until the upload completes. If the results are not yet visible, wait a few seconds and try again.
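
Because reports appear only after the upload completes, a small polling loop can wait for them (a sketch; the pattern assumes report paths contain nsys-rep, as in the listing above):

until python3 nsight_operator.py ls | grep -q 'nsys-rep'; do
    sleep 5
done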

10. (Optional) Conduct analysis with recipes#

Optionally, you can run Nsight Systems Recipes on collected profiles. For the full list of available recipes and API details, see Analysis Guide.

You can use either the Graphical User Interface or the Command Line Interface (shown below) to run recipes.

Run a recipe, which launches a new job and returns its ID:

python3 nsight_operator.py analysis run cuda_api_sum

Output:

Session not specified, using current session: 00000000-0000-0000-0000-000000000000
Analysis started successfully.
Report ID: 1

Check the status of the recipe job:

python3 nsight_operator.py analysis reports info <report-id>

Output:

{
  "id": 1,
  "name": "cuda_api_sum",
  "status": "SUCCESS",
  "data": "..."
}

Download the output of the report:

python3 nsight_operator.py analysis reports download <report-id>

11. View Results with Nsight Streamer#

Instead of downloading large report files, you can view profiling results directly in your browser using Nsight Streamer. Deploy a streamer instance by applying a NsightStreamer CR:

# nsight-streamer.yaml
apiVersion: nvidia.com/v1alpha1
kind: NsightStreamer
metadata:
  name: nsight-viewer

kubectl apply -f nsight-streamer.yaml
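
Before port-forwarding, you can check that the streamer pod and service are up (resource names derive from the CR name in this example):

kubectl get pods,svc | grep nsight-viewer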

# Forward the streamer's HTTP port; WebRTC TURN media flows through
# the cluster-wide STUNner gateway automatically.
kubectl port-forward svc/nsight-viewer-service 30080:30080

Then open http://localhost:30080 in your browser (default credentials: nvidia / nvidia) to analyze reports without transferring files from the cluster. See Nsight Streamer for full configuration options, and STUNner TURN Gateway for how the shared TURN LoadBalancer is exposed.

12. Download Profiling Results (Alternative)#

You can download reports to your local machine for analysis with the Nsight Systems desktop application.

Note

If you did not use autoconfigure in step 3, you must configure storage access first. See the instructions in Configuring Storage Access for Downloads.

# Download profiling results to a local directory
python3 nsight_operator.py download --output-dir /tmp/nsight_profile_reports

Files are organized by session title and collection ID:

/tmp/nsight_profile_reports/
  20260310120000-my-profiling-session/
    01234567-8900-abcd-abcd-000000000000/
      your-app-1_default_your-app-1_63613cca.nsys-rep
      your-app-2_default_your-app-2_3f95aeda.nsys-rep
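
To list the downloaded reports for opening in the Nsight Systems desktop application:

find /tmp/nsight_profile_reports -name '*.nsys-rep'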

Tip

Give your sessions descriptive titles with session-begin --title "my profiling session" to make downloaded results easy to identify.

13. End the Session#

python3 nsight_operator.py session-end

Output:

Session 00000000-0000-0000-0000-000000000000 ended.

Quickstart Example (Multitenant Mode)#

Multitenant Mode is designed for environments where multiple teams or namespaces need independent profiling capabilities. It involves three roles (all or some of which can be performed by the same person):

  • Cluster Admin: Installs the Nsight Operator in multitenant mode cluster-wide.

  • Namespace Admin: Configures profiling rules in their namespace.

  • Profiling User: Runs profiling sessions using the connection credentials provided by the namespace admin.

This separation allows namespace admins to manage their own profiling infrastructure without cluster-wide privileges. Each tenant coordinator automatically generates cryptographic keys for secure authentication.

The table below shows which resources each persona owns and uses:

  • Cluster Admin – owns the Nsight Operator Helm release, the cluster-wide controller, and the injector webhook; works through Helm.

  • Namespace Admin – owns the NsightCoordinator, NsightCloudStorageConfig, NsightGateway, NsightOperatorProfileConfig, and optional NsightAnalysis, NsightOtelCollector, and OTLPProxyConfig resources; works through kubectl.

  • Profiling User – owns workloads labeled nvidia-nsight-profile=enabled; works through nsight_operator.py commands such as profiler-start, profiler-stop, ls, download, analysis, session-begin, and session-end.

With auto-provisioning (default), the operator creates the per-tenant CRs automatically the first time a matching Pod is admitted; the namespace admin’s role is reduced to labeling Pods and writing a NsightOperatorProfileConfig (or letting the default rule match).

Choose Your Approach:

  • Without Auto-Provisioning – the namespace admin manually deploys tenant infrastructure. Best for full control over the infrastructure and custom configurations.

  • With Auto-Provisioning – infrastructure is auto-created when pods need profiling. Best for simplified setup and consistent configuration across tenants.

1. Cluster Admin – Install Operator#

Choose one of the following installation methods:

Without Auto-Provisioning (namespace admins will deploy their own infrastructure):

helm install --wait \
    --namespace nsight-operator \
    --create-namespace \
    nsight-operator \
    --set installation.multitenant=true \
    --set nsight-injector.nsightToolConfig.enableDefault=false \
    --set nsight-coordinator.enabled=false \
    --set cloudStorage.enabled=false \
    --set nsight-gateway.enabled=false \
    --set nsight-analysis.enabled=false \
    --set nsight-otel-collector.enabled=false \
    --set otlpProxyConfig.enabled=false \
    --set nsight-cloud-ui.enabled=false \
    --set nsight-tenant-operator.enabled=false \
    --set nsight-tenant-operator.stunner.enabled=false \
    https://helm.ngc.nvidia.com/nvidia/devtools/charts/nsight-operator-26.2.1.tgz

With Auto-Provisioning (infrastructure created automatically per namespace):

helm install --wait \
    --namespace nsight-operator \
    --create-namespace \
    nsight-operator \
    --set installation.multitenant=true \
    https://helm.ngc.nvidia.com/nvidia/devtools/charts/nsight-operator-26.2.1.tgz

When auto-provisioning is enabled, the following resources are automatically created in tenant namespaces when pods requiring profiling are detected:

  • NsightCoordinator – Coordinator deployment for profiling control (created if nsight-coordinator.enabled=true in operator values, default: true)

  • NsightCloudStorageConfig – MinIO/S3 storage for profiling results (created if cloudStorage.enabled=true in operator values, default: true)

  • NsightGateway – Envoy gateway for REST API access (created if nsight-gateway.enabled=true in operator values, default: true)

  • NsightOtelCollector – OTLP collector infrastructure (created if nsight-otel-collector.enabled=true in operator values, default: true)

  • OTLPProxyConfig – OTLP proxy injection configuration (created if otlpProxyConfig.enabled=true in operator values, default: true)

  • NsightAnalysis – Analysis service for running recipes (created if nsight-analysis.enabled=true in operator values, default: true)

  • NsightCloudUI – Browser UI for sessions, collections, and analysis jobs (created if nsight-cloud-ui.enabled=true in operator values, default: true)

  • NsightTenantOperator – Tenant API used by Nsight Cloud UI to manage streamers (created if nsight-tenant-operator.enabled=true in operator values, default: true)

Note

Namespace admins can pre-deploy their own infrastructure (NsightCoordinator, NsightCloudStorageConfig, NsightGateway, etc.) before any pods are created. Auto-provisioning will detect existing resources and skip creating them, allowing custom configurations to take priority over defaults.

2. Namespace Admin – Setup Tenant Namespace#

2.1 Create the namespace (if it does not exist)#

kubectl create namespace my-team-ns

2.2 Deploy namespace infrastructure (without auto-provisioning only)#

Note

Skip this step if using Auto-Provisioning – the infrastructure will be created automatically. However, you can optionally pre-deploy your own resources with custom configuration, and auto-provisioning will detect and use them instead of creating new ones.

Note

Only one NsightCoordinator is supported per namespace. Profile configs in the namespace will automatically discover and use it.

Deploy the NsightCoordinator, NsightCloudStorageConfig, NsightAnalysis, and NsightGateway CRs. The example below uses a ClusterIP gateway service (reach it via kubectl port-forward; see Gateway Service Configuration below for a LoadBalancer variant) and MinIO with ephemeral storage:

Warning

The example below uses MinIO’s ephemeral storage for simplicity – if the MinIO Pod restarts or is rescheduled, all profiling reports are lost. Use this layout only for quickstart / demo environments. For anything beyond that, enable persistent MinIO or point NsightCloudStorageConfig at an external S3-compatible backend. See Storage Configuration.

# coordinator.yaml
apiVersion: nvidia.com/v1alpha1
kind: NsightCoordinator
metadata:
  name: nsight-coordinator
  namespace: my-team-ns
spec:
  restAPI:
    enabled: true
---
# cloud-storage.yaml
apiVersion: nvidia.com/v1alpha1
kind: NsightCloudStorageConfig
metadata:
  name: nsight-cloud-storage
  namespace: my-team-ns
spec:
  enabled: true
  minio:
    enabled: true
---
# analysis.yaml
apiVersion: nvidia.com/v1alpha1
kind: NsightAnalysis
metadata:
  name: nsight-analysis
  namespace: my-team-ns
---
# gateway.yaml
apiVersion: nvidia.com/v1alpha1
kind: NsightGateway
metadata:
  name: nsight-operator-gateway
  namespace: my-team-ns
spec:
  service:
    type: ClusterIP
  cloudStorageRef:
    name: nsight-cloud-storage

Apply the configuration:

kubectl apply -f coordinator.yaml -f cloud-storage.yaml -f analysis.yaml -f gateway.yaml -n my-team-ns

Wait for the gateway to be ready:

kubectl wait --for=condition=ready pod -l app.kubernetes.io/component=gateway -n my-team-ns --timeout=120s

2.3 Deploy the NsightOperatorProfileConfig#

Create a profile config to define which pods should be profiled. The coordinator in the same namespace is automatically discovered, so no explicit reference is needed:

# tenant-profile-config.yaml
apiVersion: nvidia.com/v1
kind: NsightOperatorProfileConfig
metadata:
  name: tenant-profile-config
  namespace: my-team-ns
spec:
  defaultNsightToolConfigRef: "tenant-nsight-tool-config"
  nsightToolConfigs:
    - name: "tenant-nsight-tool-config"
      coordinator: true
      nsightToolArgs: "--python-sampling=true"
      injectionIncludePatterns:
        - ".*"
  injectionRules:
    - name: "profile-labeled-pods"
      objectSelector:
        matchLabels:
          nvidia-nsight-profile: enabled

Apply the configuration:

kubectl apply -f tenant-profile-config.yaml
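
Verify that the profile config was created:

kubectl get nsightoperatorprofileconfig -n my-team-ns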

3. Profile Your Workload#

3.1 Label workloads for profiling#

# Label a deployment for profiling
kubectl patch deployment my-app -n my-team-ns -p '{"spec":{"template":{"metadata":{"labels":{"nvidia-nsight-profile":"enabled"}}}}}'

# Wait for rollout
kubectl rollout status deployment/my-app -n my-team-ns

Note

If using Auto-Provisioning, the coordinator, storage, and gateway infrastructure will be created automatically when the first matching pod is deployed.
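
With auto-provisioning you can watch the tenant infrastructure appear after the first matching pod is admitted (the resource kinds are those listed in step 1):

kubectl get nsightcoordinator,nsightcloudstorageconfig,nsightgateway -n my-team-ns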

3.2 Configure the CLI#

The CLI must be configured to connect to the gateway before profiling commands can be used. Credentials and connection details are stored in ~/.nsight-cloud.conf.

Note

If you plan to expose the gateway outside the cluster (e.g. via LoadBalancer or NodePort), it is highly recommended to configure authentication. See Gateway Authentication for details.

With Auto-Provisioning (wait for gateway to be ready first):

# Wait for auto-provisioned gateway to be ready
kubectl wait --for=condition=ready pod -l app.kubernetes.io/component=gateway -n my-team-ns --timeout=120s

Recommended – autoconfigure from cluster (requires kubectl access):

python3 nsight_operator.py autoconfigure -n my-team-ns

This automatically discovers the gateway URL, authentication mechanism, and storage configuration. For ClusterIP services, subsequent CLI commands will automatically set up port-forwarding.

Alternative – manual configure (when kubectl access is not available):

# No authentication (default)
python3 nsight_operator.py configure --gw http://<gateway-ip>:8888

# With API key authentication -- read from an env var rather than a
# shell argument so the key is not saved in shell history.
export NSIGHT_API_KEY=...  # paste or load from your secret store
python3 nsight_operator.py configure --gw http://<gateway-ip>:8888 --apikey "${NSIGHT_API_KEY}"

Note

When using configure with a ClusterIP gateway, you must set up port-forwarding manually:

kubectl port-forward -n my-team-ns svc/nsight-operator-gateway 8888:8888 &
python3 nsight_operator.py configure --gw http://localhost:8888

3.3 Start and stop profiling#

# Start profiling
python3 nsight_operator.py profiler-start

# Run your workload...

# Stop profiling
python3 nsight_operator.py profiler-stop

3.4 View results#

# List results
python3 nsight_operator.py ls

3.5 Post-capture workflow#

From here the workflow is the same as single-tenant mode – the CLI, Analysis service, and Nsight Streamer all live behind the per-namespace gateway you pointed the CLI at. Use the commands below as a quick reference; Quickstart Example steps 10-12 walk through the same flow in detail.

# Run an analysis recipe (optional); see /AnalysisGuide/index
python3 nsight_operator.py analysis run cuda_api_sum
python3 nsight_operator.py analysis reports download <report-id>

# View a report in the browser via Nsight Streamer; see /NsightStreamer/index
kubectl apply -n my-team-ns -f nsight-streamer.yaml
kubectl port-forward -n my-team-ns svc/nsight-viewer-service 30080:30080

# Or download .nsys-rep files locally
python3 nsight_operator.py download --output-dir /tmp/nsight_profiles

# End the session when done
python3 nsight_operator.py session-end

Note

If you used configure (rather than autoconfigure) in step 3.2, set up storage access before download – see Configuring Storage Access for Downloads.

Cleanup#

Namespace admin – Remove profile config and infrastructure:

# Remove profile config
kubectl delete nsightoperatorprofileconfig tenant-profile-config -n my-team-ns

# Remove infrastructure
kubectl delete nsightgateway nsight-operator-gateway -n my-team-ns
kubectl delete nsightcoordinator nsight-coordinator -n my-team-ns
kubectl delete nsightcloudstorageconfig nsight-cloud-storage -n my-team-ns
kubectl delete nsightanalysis nsight-analysis -n my-team-ns
kubectl delete nsightotelcollector nsight-operator-otel-collector -n my-team-ns
kubectl delete otlpproxyconfig nsight-otlp-proxy-config -n my-team-ns
kubectl delete nsightcloudui nsight-operator-cloud-ui -n my-team-ns
kubectl delete nsighttenantoperator nsight-operator-tenant-operator -n my-team-ns

Cluster admin – Uninstall operator (only if no longer needed by any tenant):

helm uninstall nsight-operator -n nsight-operator

Gateway Service Configuration#

These options apply when auto-provisioning is not used and namespace admins deploy their own gateway via the NsightGateway CR. The gateway provides the HTTP entry point for accessing the Coordinator and Analysis services.

LoadBalancer Service#

For direct external access (if LoadBalancer services are available in the cluster):

apiVersion: nvidia.com/v1alpha1
kind: NsightGateway
metadata:
  name: nsight-operator-gateway
  namespace: my-team-ns
spec:
  service:
    type: LoadBalancer
  cloudStorageRef:
    name: nsight-cloud-storage

Connect using the external IP:

export GATEWAY_IP=$(kubectl get svc nsight-operator-gateway -n my-team-ns -o jsonpath='{.status.loadBalancer.ingress[0].ip}')
python3 nsight_operator.py configure --gw http://${GATEWAY_IP}:8888
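
Some cloud load balancers publish a hostname instead of an IP; in that case read the hostname field:

export GATEWAY_IP=$(kubectl get svc nsight-operator-gateway -n my-team-ns -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')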

ClusterIP with Port-Forward#

For local development or when external access is not needed:

apiVersion: nvidia.com/v1alpha1
kind: NsightGateway
metadata:
  name: nsight-operator-gateway
  namespace: my-team-ns
spec:
  service:
    type: ClusterIP
  cloudStorageRef:
    name: nsight-cloud-storage

Connect using port-forward:

kubectl port-forward -n my-team-ns svc/nsight-operator-gateway 8888:8888 &
python3 nsight_operator.py configure --gw http://localhost:8888

Quickstart Example (NVIDIA Dynamo)#

This example shows how to profile NVIDIA Dynamo deployments using the helper script nsight_operator_dynamo.py.

The script supports two deployment modes:

  • Single-tenant – one team owns the cluster, or you want a quick standalone setup. Setup command: install.

  • Multi-tenant – multiple teams share a cluster; each team configures their own namespace. Setup command: configure (run after the cluster admin installs the operator).

Note

Profiling injection only applies to pods created after the operator/configuration is applied. Existing pods must be restarted.

1. Setup Profiling#

Choose one of the following options based on your environment:

Option A: Single-Tenant Mode (install)#

Use this when you have cluster-admin access and want a standalone setup. This installs the Nsight Operator and configures Dynamo profiling in one step.

# Install for all Dynamo deployments with GPU metrics
python3 nsight_operator_dynamo.py -n nsight-operator install --gpu-metrics

Or target a specific deployment with custom options:

python3 nsight_operator_dynamo.py -n nsight-operator install \
    --deployment-name trtllm-disagg \
    --components Frontend Backend \
    --nsys-profile-args "-t cuda,nvtx,osrt --python-sampling=true" \
    --gpu-metrics

Option B: Multi-Tenant Mode (configure)#

Use this when a cluster admin has already installed the Nsight Operator in multitenant mode, and you want to configure profiling in your namespace.

Cluster Admin (one-time setup):

helm install --wait \
    --namespace nsight-operator \
    --create-namespace \
    nsight-operator \
    --set installation.multitenant=true \
    https://helm.ngc.nvidia.com/nvidia/devtools/charts/nsight-operator-26.2.1.tgz

Namespace User:

# Configure profiling in your namespace (namespace must already exist)
python3 nsight_operator_dynamo.py -n my-dynamo-ns configure --gpu-metrics

Or target a specific deployment:

python3 nsight_operator_dynamo.py -n my-dynamo-ns configure \
    --deployment-name trtllm-disagg \
    --components Frontend Backend \
    --nsys-profile-args "-t cuda,nvtx,osrt --python-sampling=true" \
    --gpu-metrics

2. Configure the CLI#

Wait for the gateway to be ready and configure the CLI:

# Wait for the gateway to be ready
kubectl wait --for=condition=ready pod -l app.kubernetes.io/component=gateway -n <namespace> --timeout=120s

Recommended – autoconfigure from cluster (requires kubectl access):

python3 nsight_operator.py autoconfigure -n <namespace>

This automatically discovers the gateway URL, authentication mechanism, and storage configuration. For ClusterIP services, subsequent CLI commands will automatically set up port-forwarding.

Alternative – manual configure (when kubectl access is not available):

# Set up port-forward for ClusterIP gateway
kubectl port-forward -n <namespace> svc/nsight-operator-gateway 8888:8888 &

# Configure the CLI
python3 nsight_operator.py configure --gw http://localhost:8888

3. Deploy or Restart Dynamo Workloads#

Restart existing deployments to enable profiling injection:

# Using rollout restart
kubectl rollout restart deployment/trtllm-disagg-frontend -n <namespace>
kubectl rollout restart deployment/trtllm-disagg-backend -n <namespace>

# Or using scale down/up
kubectl scale deployment/trtllm-disagg-frontend --replicas=0 -n <namespace>
kubectl scale deployment/trtllm-disagg-frontend --replicas=1 -n <namespace>
kubectl scale deployment/trtllm-disagg-backend --replicas=0 -n <namespace>
kubectl scale deployment/trtllm-disagg-backend --replicas=1 -n <namespace>

4. Start Profiling#

python3 nsight_operator.py profiler-start

Output:

Profiling started for Session=e85d2a48-3284-4cfe-984a-20b0fa7d1673 Collection=aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee.

5. Run Your Workload#

Execute the workload you want to profile.

6. Stop Profiling#

python3 nsight_operator.py profiler-stop

Output:

Profiling stopped for Session=e85d2a48-3284-4cfe-984a-20b0fa7d1673 Collection=aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee.

7. View Results#

# List available results
python3 nsight_operator.py ls

8. Post-capture workflow#

From here the workflow is identical to the generic coordinator-mode walkthrough; use the commands below as a quick reference and see Quickstart Example steps 10-12 for the details.

# Run an analysis recipe (optional); see /AnalysisGuide/index
python3 nsight_operator.py analysis run cuda_api_sum
python3 nsight_operator.py analysis reports download <report-id>

# View a report in the browser via Nsight Streamer; see /NsightStreamer/index
kubectl apply -n my-dynamo-ns -f nsight-streamer.yaml
kubectl port-forward -n my-dynamo-ns svc/nsight-viewer-service 30080:30080

# Or download .nsys-rep files locally
python3 nsight_operator.py download --output-dir /tmp/nsight_profile_reports

# End the session when done
python3 nsight_operator.py session-end

Note

If you used configure (rather than autoconfigure) in step 2, set up storage access before download – see Configuring Storage Access for Downloads.

9. Cleanup (Optional)#

Single-tenant mode:

python3 nsight_operator_dynamo.py -n nsight-operator uninstall

Multi-tenant mode:

python3 nsight_operator_dynamo.py -n my-dynamo-ns unconfigure

Command Reference#

  • install (single-tenant) – install Nsight Operator with the Dynamo profiling configuration.

  • uninstall (single-tenant) – uninstall Nsight Operator.

  • configure (multi-tenant) – configure Dynamo profiling in a namespace (creates a NsightOperatorProfileConfig).

  • unconfigure (multi-tenant) – remove the Dynamo profiling configuration from a namespace.

  • setup-env (both) – download the connection environment for profiling control.

  • status (both) – check whether Nsight Operator is installed.

  • download (both) – download profiling results.

Common Options#

  • -n, --namespace – target namespace (default: nsight-operator).

  • --deployment-name – specific Dynamo deployment name to profile.

  • --components – specific Dynamo components to profile (e.g., Frontend Backend).

  • --nsys-profile-args – override the default Nsight Systems arguments (default: -t cuda,nvtx,osrt --cuda-graph-trace=node --python-sampling=true --pytorch=autograd-nvtx).

  • --gpu-metrics – install the GPU metrics DaemonSet alongside profiling.

  • -f, --values – additional Helm values file (install only).

Notes#

  • You can scope profiling to specific Dynamo components using --components (e.g., Frontend, Backend).

  • Use --nsys-profile-args to override the default Nsight Systems arguments.

  • For more precise control over profiling (environment variables, volumes, etc.), see Installation and Configuration.